basix -an interpreter written in tex - tug

12
BASIX -An Interpreter Written in TEX Andrew Marc Greene MIT Project Athena, E40-342, 77 Massachusetts Avenue, Cambridge, MA 02139 617-253-7788. Internet: [email protected] Abstract An interpreter for the BASIC language is developed entirely in W . The interpreter presents techniques of scanning and parsing that are useful in many contexts where data not containing formatting directives are to be formatted by m. W'S expansion rules are exploited to provide a macro that reads in the rest of the input file as arguments and which never stops expanding. Introduction It is a basic tenet of the T@ faith that rn is Turing-equivalent and that we can write any program in W . It is also widely held that is "the most idiosyncratic language known to us."' This project is an attempt to provide a simple programming front end to W. BASIC was selected because it is a widely used interpreted language. It also features an infix syntax not found in Lisp or POSTSCRIPT. This makes it a more difficult but more general problem than either of these others. The speed of the BASIX interpreter is not impressive. It is not meant to be. The purpose of this interpreter is not to serve as the BASIC implementation of choice. Its purpose is to display useful paradigms of input parsing and advanced W programming. Interaction with Associative arrays. Using \csnane it is possible to implement associative arrays in 7&X. Associative arrays are arrays whose index is not necessarily a number. As an example, if \student has the name of a student, we might look up the student's grade with which would be \grade.Greene in my case. (In the case of \csname, all characters up to the \endcsname are used in the command sequence regardless of their category code.) Ward, Stephen A. and Robert H. Halstead, Jr., Computation Structures, MIT Press, 1990 BASIX makes extensive use of these arrays. Commands are begun with C; functions with F; variables with V; program lines with /; and the linked list of lines with L. This makes it easy for the interpreter to look up the value of any of these things, given the name as perceived by the user. One benefit of \csname.. . \endcsname is that if the resultant command sequence is undefined. ~'FJ replaces it with \relax. This allows us to check, using \ i f x , whether the user has specified a non-existent identifier. This trick is used in exercise 7.7 in The mbook. We use it in BASIX to check for syntax errors and uninitialized variables. Token streams. The BASX interpreter was de- signed to be run interactively. It is called by typing tex basix; the file ends waiting for the first line of BASIC to be entered at W ' s * prompt. This also allows other files to \input basix and immediately follow it with BASIC code. We cannot have the scanner read an entire line at once, since if the last line of basix.tex were a macro that reads a line as a parameter, we'd get a "File ended while scanning use of \getlinen error. Instead, we use a method which at first blush seems more convoluted but which is actually simpler. We note that TEX does not make any dis- tinction between the tokens that make up our interpreter and the tokens that form the BASIC code. The BASIX interpreter is carefully constructed so that each macro ends by calling another macro (which may read parameters). Thus, expansion is never completed, but the interpreter can continue to absorb individual characters that follow it. These characters affect the direction of the expansion; it is this behaviour that allows us to implement a BASIC interpreter in W . TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting 381

Upload: others

Post on 23-Feb-2022

12 views

Category:

Documents


0 download

TRANSCRIPT

BASIX -An Interpreter Written in TEX

Andrew Marc Greene MIT Project Athena, E40-342, 77 Massachusetts Avenue, Cambridge, MA 02139

617-253-7788. Internet: [email protected]

Abstract

An interpreter for the BASIC language is developed entirely in W . The interpreter presents techniques of scanning and parsing that are useful in many contexts where data not containing formatting directives are to be formatted by m. W'S expansion rules are exploited to provide a macro that reads in the rest of the input file as arguments and which never stops expanding.

Introduction

It is a basic tenet of the T@ faith that rn is Turing-equivalent and that we can write any program in W . It is also widely held that is "the most idiosyncratic language known to us."' This project is an attempt to provide a simple programming front end to W.

BASIC was selected because it is a widely used interpreted language. It also features an infix syntax not found in Lisp or POSTSCRIPT. This makes it a more difficult but more general problem than either of these others.

The speed of the BASIX interpreter is not impressive. It is not meant to be. The purpose of this interpreter is not to serve as the BASIC implementation of choice. Its purpose is to display useful paradigms of input parsing and advanced W programming.

Interaction with

Associative arrays. Using \csnane it is possible to implement associative arrays in 7&X. Associative arrays are arrays whose index is not necessarily a number. As an example, if \s tudent has the name of a student, we might look up the student's grade with

which would be \grade.Greene in my case. (In the case of \csname, all characters up to the \endcsname are used in the command sequence regardless of their category code.)

Ward, Stephen A. and Robert H. Halstead, Jr., Computation Structures, MIT Press, 1990

BASIX makes extensive use of these arrays. Commands are begun with C; functions with F; variables with V; program lines with /; and the linked list of lines with L. This makes it easy for the interpreter to look up the value of any of these things, given the name as perceived by the user.

One benefit of \csname.. . \endcsname is that if the resultant command sequence is undefined. ~'FJ replaces it with \ relax. This allows us to check, using \ i fx , whether the user has specified a non-existent identifier. This trick is used in exercise 7.7 in The m b o o k . We use it in BASIX to check for syntax errors and uninitialized variables.

Token streams. The BASX interpreter was de- signed to be run interactively. It is called by typing t e x basix; the file ends waiting for the first line of BASIC to be entered at W ' s * prompt. This also allows other files to \input basix and immediately follow it with BASIC code.

We cannot have the scanner read an entire line at once, since if the last line of bas ix . tex were a macro that reads a line as a parameter, we'd get a "File ended while scanning use of

\ge t l inen error. Instead, we use a method which at first blush seems more convoluted but which is actually simpler.

We note that TEX does not make any dis- tinction between the tokens that make up our interpreter and the tokens that form the BASIC code. The BASIX interpreter is carefully constructed so that each macro ends by calling another macro (which may read parameters). Thus, expansion is never completed, but the interpreter can continue to absorb individual characters that follow it. These characters affect the direction of the expansion; it is this behaviour that allows us to implement a BASIC interpreter in W .

TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting 381

Andrew Marc Greene

Category codes. Normally, TEX distinguishes be- tween sixteen categories of characters. To avoid unwanted side effects, the BASIX interpreter reas-

signs category codes of all non-letters to category 12, "other". One undocumented feature of TJ$ is that it will never let you strand yourself without

an "active" character (which is normally the back-

slash); if you try to reassign the category of your

only active character (with, e.g., \catcode'\\=12),

it will fail silently. To allow us to use every typeable character in BASIX, we first make \catcode31=0,

which then allows us to reassign the catcode of the backslash without stranding ourselves. (Of course,

the BASIX interpreter provides an escape back to TEX which restores the normal category codes.)

We also change the category code of --M, the end-of-line character, to "active". This lets us detect end-of-line errors that may crop up.

Semantics

Words. A word is a collection of one or more

characters that meet requirements based on the first character. The following table describes these

rules using regular expression n ~ t a t i o n . ~ These rules are:

First Regular Character Expression Meaning

[A-Z, a-zl [A-Z, a-z] [A-Z, a-z ,0-9]*$? Identifier LO-91 CO-91+ Integer

I8 V! [^Ill * [ll , -'MI String

Other Symbol

An end-of-line at the end of a string literal is converted to a I' .

Everything in BASIX is one of these four types. Line numbers are integer literals, and both variables

and commands are identifiers.

The \scan macro is used in BASIX to read

the next word. It uses \ f u t u r e l e t to look at the next token and determine whether it should be part

of the current word; ie . , whether it matches the regular expression of the current type. If so, then it

is read in and appended; otherwise \scan returns,

In this notation, [A-Z, a-zl means "any char- acter falling between A and Z or between a and z ,

inclusive." An asterisk means "repeat the preceding

specification as many times as needed, or never." A plus means "repeat the preceeding specification as

many times as needed, at least once." A question mark means "repeat the preceeding specification zero or one times." A dot means "any character."

leaving this next token in the input stream. The word is returned in the macro \word.

The peculiar way \scan operates gives rise to

new problems, however. We can't say

because \scan looks at the tokens which follow

it, which in this case are \lineno=\word. We

need some way to define the goto command so

that the \scan is at the end of the macro; this

will take the next tokens from the input stream.

We therefore have \scan "return" to its caller by

breaking the caller into two parts: the first part ends

with \scan and the second part contains the code which should follow. The second macro is stored in \af terscan, and \scan ends with \af terscan.

As syntactic sugar, \ a f t e r has been defined as

\ l e t \ a f t e r scan . This allows \Cgoto to be coded

as

(Actually, the goto code is slightly more compli- cated than this; but the scanner is the important

point here.) This trick is used throughout the

BASIX interpreter to read in the next tokens from

the user's input without interrupting the expansion

of the W macros that comprise the interpreter.

Expressions. An expression is a sequence of words

that, roughly speaking, alternates between val-

ues and operators. Values fall into one of three

categories: literals, identifiers, and parenthesized

expressions. An operator is one of less-than, more- than, equality, addition, subtraction, multiplication,

division, or reference. (Reference is an implicit op-

erator that is inserted between a function identifier

and its parameters.) Expressions are evaluated in an approach sim-

ilar to that used in the scanner. A word is scanned using \scan and its type is determined.

"Left-hand" values are stored for relatives, addi-

tives, multiplicatives, and references. Using W ' s

grouping operations the evaluator is reentrant, per- mitting parenthesized expressions to be recursively

evaluated.

In order to achieve a functionality similar to

that of \ fu tu re l e t , we exit the evaluator by

\expandafter\aftereval\word, where the macro

\ a f t e r eva l is analogous to \af terscan. Since

\word will contain neither macros nor tokens whose

category codes need changing, this is as good as

\f u tu re l e t .

382 TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting

BASIX - An Interpreter Written in

Structure of the Interpreter

The file basix.tex, which appears as an appendix to this paper. defines all the macros that are needed to run the interpreter. The last line of this file is \endeval, which is usually called when the interpreter has finished evaluating a line. In this case, it calls \enduserline, which, in turn. calls \parseline.

The \parsel ine macro is part of the Program

Parser section of the interpreter. It starts by calling \scan to get the first word of the next line. If \word is an integer literal. it is treated as a line number and the rest of the input line is stored in the appropriate variable without interpretation. If \word is not an integer literal, it is treated as a command.

Each command is treated with an ad hoe

routine near the bottom of basix . tex: however. most of them call on a set of utilities that appear earlier in the file.

Character-string calls. There is a simple library of macros that convert between ASCII codes and character tokens, test for string equality, take subsections of strings, and deal with concatenation.

Debugging definitions. The macro \diw is a debugging-mode-only \immediate\writ e l6 (hence the name \diw). It is toggled by the user commands debug and nodebug.

Expression evaluation. The expression evaluator has a calling structure similar to that of the scanner. Calling routines are split in twain, with

being a prototype of the calling convention. The evaluator will \scan as many words as it can that make sense; in contrast to the scanner, however, it evaluates each instead of merely accumulating them. This process is described above.

Linked list. The BASIX interpreter maintains a linked list of line numbers. The macro

contains the next line number. These macros will follow the linked list (for the l is t command); they also can insert a new line or return the number of the next line.

Program parser. This section contains a number of critical routines. \ eva l l i ne is the macro that does the dispatching based on the user's command. \mandatory specifies what the next character must be (for example, the character after the identifier

in a l e t statement must be =). \parsel ine has already been described.

Syntactic scanner. This is the section containing \scan and its support macros, which are described above.

Type tests. These routines take an argument and determine whether it is an identifier, a string variable, a string literal, an integer literal, a macro, or a digit. The normal way of calling these routines is

These predicates expand into either tt (true) or t f (false). Syntactic sugar is provided in the form of \ i t s t r u e a n d \ i t s f a l s e . \ ifstring\worddoesn't work because of the way matches \ i f and \f i tokens - only \if-style primitives are recognized.

User Utilities. This is the section of the inter- preter in which most of the user commands are defined. Commands are preceded by \C ( e . g . ,

\ C l i s t is the macro called when the user types l i s t ) . Functions are preceded by \F.

Limitations of this implementation

BAW is a minimal BASIC interpreter. There are enough pieces to show how things work, but not enough to do anything practical. Here is a description of the capabilities of this interpreter, so that the reader can play with it. Error recovery is virtually non-existent, so getting the syntax right and not calling non-existent functions is critical.

Entering programs. Lines beginning with an in- teger literal are stored verbatim. Lines are stored in ascending order, and if two or more lines are entered with the same number, only the last is retained.

Immediate commands. Lines not beginning with an integer are executed immediately. Colons are not supported, so only one command may appear on a line. (When a program line is executed, its line number is stripped and the remainder is executed as though it were an immediate command.)

Commands. The following commands are imple- mented in some form: goto, run: l i s t , pr in t , l e t , i f , debug, nodebug, rem. system, ex i t , and s top (but not cont). The interpreter is case sensitive (although with an appropriate application of \up- percase it needn't be: I was lazy), so these must be entered in lowercase.

The following tables list the commands with no parameters, the commands that take one parameter,

TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting 383

Andrew Marc Greene

and the two commands ( l e t and i f ) that take special forms.

The commands with no parameters are:

run

stop

l i s t

debug

nodebug

rem

system

e x i t

Starts execution at the lowest line number. Does not clear variables. Stops execution immediately. Lists all lines in order. Useful information sent to terminal. Stop debugging mode. Rest of line is ignored. Exit m. Exit BASIX to m.

The commands with one parameter are:

goto Starts execution at the given line number. l i s t Lists the given line. pr in t Displays the given argument.

or to take source code in a given language and pretty-print it.

The definition of a "word" can be changed by modifying the \scan macro. It selects a def- inition for \scantest based on the first char- acter; scantes t is what determines if a given token matches the selected regular expression. The \scantest macro is allowed to redefine itself.

The evaluation of expressions can be extended or changed by modifying the \math code. Floating- point (or even fixed-point) numbers could be dealt with, although the period would need to pass the \d ig i tP test in some cases and not in others.

The method of dealing with newlines is easily removed for languages such as POSTSCRIPT or Lisp for which all whitespace is the same.

(Any of these arguments may be an expression.) Obtaining copies of basix.tex The l e t and i f commands take special forms.

Variable assignments require an explicit l e t com- The Source code to this Paper and the BASI~

mand: interpreter are available by anonymous ftp from gevalt . m i t .edu. which is at IP address 18.72.1.4.

l e t ( identz f ier) = ( e x p ~ e s s i o n ) I will also mail out copies to anyone without ftp

Conditionals do not have an e l s e clause, and goto abilities, is not implie6 by then:

i f (expression) then ( n e w c o m m a n d )

The new command is treated as its own line.

Expressions. Expressions are defined explicitly above. The operators are +, -, *, /, <, =, and >. Parentheses may be used for grouping. Variables may not be referenced before being set. (Unlike in traditional BASIC, variables are not assumed to be 0 if never referenced, and they aren't cleared when run is encountered).

Functions are invoked with

( f u n c t i o n n a m e ) ( ( p a r a m ) , ( p a r a m ) , . . . ) The parameters are implicitly-delimited expressions that are passed to \matheval (which is simply called \eval in the table below to save space). The following functions are defined: l en ( s t r ing) Returns number of characters

in the string. chr$ (expr) Returns the character with the

given ascii value. inc (expr) Returns \eval(expr) + 1. min(expr1, expr2) Returns the lesser of two \evals.

Generalization

The BASIX interpreter can easily be generalized t o serve other needs. These other needs might be to interpret Lisp or POSTSCRIPT code [Anyone want to write a POSTSCRIPT interpreter in m?];

Summary

Using a number of TEX tricks, some more devious than others, a BASIC interpreter can be written in QX. While QX macros will often be less efficient than, for example, auk paired with TEX, solutions using only will be more portable. A less general macro package than BASIX could be written that uses these routines as paradigms and that is very efficient at parsing a specific input format.

TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting

BASIX -An Interpreter Written in

Listing of basix . t e x 1.---basix.tex begins here.

2. % 3.% BaSiX (with the emphasis on SICK!) by Andrew Marc Greene

4. % 5. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%'rrL%'L%%'rrL%%%%'L%%%%'L%%%%%% 6. % 7. % Andrew's Affiliations

8. % 9. % Copyright (C) 1990 by Andrew Marc Greene

lo. % <amgreeneQmit . edu> 11. % MIT Project Athena 12. % Student Information Processing Board

13. % All rights reserved. 14. % 15. r:rr~'~'~:~:~r~'r~:~'~'~:rr~~r:rr~:r~:r~:r~'rr~:r~~~:r~:~:rrrr~;rrrr~;~;~:r~;rr~:r~ 16. % 17. % BaSiX's Beginnings 18. % 19. \def \f lageol~\catcode13=13}

20.\def\endflageol{\catcode13=5) 21. \def\struncat{\catcode1\$=12)

22. \def\strcat{\catcodel\$=ll}

23. \f lageol\let\eol 24. \endf lageol 25. \newif \ifresult

26. %\newcount \xa\neucount \xb

27. \def \iw(\immediate\uritel6} 28. \def \empty{) 29. \def \gobble#l{)

30. \def \spc{ )

31. \def\itstrue{tt} 32. \def \itsf alse{tf )

33. \def \isnull#l{\resultf alse

34.\expandafter\ifx\csname ernpty#l\endcsname\empty\resulttrue\fi} 35. \newcount \matha\newcount\mathb

36. % 37. %'l.%%%%%%%%%'A'rrL;/,%%%%%%%%%%%%%'L%'L%%%'L%%%%%%%%%%%%'L%'L% 38. % 39.X Character-string Calls

40. % 41. \newcount\strtmp

42.\def\ascii#1{\strtmp1#1)

43.\def\chr#1{\begingroup\uccode65=#l\uppercaseC\gdef\tmp{A)}\endgroup}

44. \def\strlen#l{\strtmp-2% don't count " " \iu tokens 45. \expandafter\if \stringP #l\let\next\strIter\strIter #l\iw\f i)

46. \def \strIter#l{\if x\iw#l\let\next\relax\else\advance\strtmp by l\relax 47. \f i\next}

48.\def\Flen{\expandafter\strlen\expandafter{\Pa}\retun{\nmber\strtmp}} 49. \strcat

50. \def \F$chr${\expandaf ter\chr\expandafter{\Pa}\return{\tmp}} 51. \struncat

52. % first char only: 53. % \def \Fasc{\expandafter\asc\expandaf ter(\)) 54. % 55. %%%%%'crrL%%%%%%%%%%'rrL%'L%%%%'rrX'L%'rrrL%%%%%'rrL%'L%%'L%'L%%'rL% 56. % 57. % Debugging Definitions

58. %

TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting

Andrew Marc Greene

64. %%%%%%%%%%%%%%%%'A%%'A%%%%%%'/A%%'~A%%#%%'L 65. % 66. % Expression Evaluation 67. % 68. \def\expression{\let\afterexpression\afterscan\math) 69. % 70.% (math is a misnomer and should -> expr) 71. % 72. \newcount\parens\newcount\mathParams 73. \def \math{\parens=O\mathParams96\mathInit\matheval)

74.\def\mathRecurse{\advance\parens by l\relax\mathParams96\mathInit\matheval) 75. \def \mathInit{\begingroup 76. \let\mathAcc\empty

77. \let\mathOpRel\empty 78. \let\mathOpAdd\empty

79. \let \mathOpMul\empty 80. \let\mathlhFlef \empty)

81. % 82. \def \matheval{\af ter\mathbranch\scan) 83. % 84.+def\mathbranch{\diw{EXPRESS:\expandafter\noexpand\word:) 85. \let \next \matherr 86.\ifx\empty\uord\let\next\mathHardEnd\else % Expr. end? 87.\expandafter\if\numberP\uord\let\next\matiteral\diw)\fi '/, Number? 88.\expandafter\if\stringP\word\let\next\mathLiteral\fi String literal?

89.\expandafter\if\identifierP\word\let\next\mathIdentifier\fi % Identifier? 90.\expandafter\if\stringvarP\word\let\next\mathIdentifier\fi % :-(

9~.\expandafter\if\macroP\word\let\next\mathMacro\fi % Macro? 9~.\ifx\word\0left\let\next\mat~ecurse\fi % Open paren? 93.\ifx\word\0right\let\next\mathEndRecurse\fi % Close paren? 94.\ifx\uord\0comma\let\next\mathComma\fi % Comma? 95. % 96. % Operator? 97. % 98. \if x\word\Oplus\let\next\math0p\diu{ ! +>\f i 99.\ifx\word\Ominus\let\next\mathOp\diu{!-)\fi

loo.\ifx\word\Otimes\let\next\mathOp\diu{!*)\fi lOl.\ifx\uord\Odiv\let\next\math0p\diw{!/)\fi lO2.\ifx\word\Olt\let\next\mathOp\diw{!<)\fi l03.\ifx\word\Oeq\let\next\math0p\did!=)\fi

l04.\ifx\word\Ogt\let\next\math0p\div(!>)\fi 105. % 106. \f i\next) 107. % 108. \def\Oleft{()\def \Oright{))\def \Ocomrna{,)

109.\def\~plus{+)\def\~minusC-)\def\0timesC+)\def\~div{/~ 110. \def \Olt{<)\def \OeqC=)\def \Ogt{>) 111. % 112.% There's got to be a better way to do the above .... 113. % 114.\def\math~iteral{\diw{MLIT)\ifx\empty\mathAcc\diw{AC~~:\word:) 115. \expandafter\def \expandafter\mathAcc\expandafter 116.~\expandafter\expandafter\expandafter\empty\word) 117. \else

118.\diw{ACC has :\mathAcc: and word is :\word:) 119.\errmessage{Syntax Error: Two values with no operator)\fi\matheval) 120. % 121.% Operator stuff: (Need to add string support / error checking) 122. % 123. \def \mathAdd{\advance\matha by \mathb)

124. \def \mathSub{\advance\matha by -\mathb)

125. \def\mathM~l(\multiply\matha by \mathb) 126. \def \matMiv{\divide\matha by \mathb)

127. \def\mathEQ{\ifnum\matha=\mathb\matha-l\else\mathaO\fi)

386 TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting

BASIX-An Interpreter Written in TEX

128. \def \mathGT{\ifnum\matha>\mathb\matha-1\else\mathaO\f i) 129. \def\mathLT{\ifnum\matha<\mathb\matha-l\else\mathaO\fi)

130. % 131. \def \mathFlushRel{\mathFlushAdd\if x\empty\mathOpRel\else 132. \matha=\mathlhRel\relax\mathb=\mathAcc\relax\math0pRel 133. \edef\mathAcc{\number\matha~\let\mathOpRel\empty\fi)

134. % 135. \def\mathFlushAdd{\mathFlushMul\ifx\empty\math0pAdd\else

136. \matha=\mathlhAdd\relax\mathb=\mathAcc\relax\math0pAdd 137. \edef \mathAcc{\number\matha)\let\mathOpAdd\empty\f 1)

138. % 139.\def\mathFlushMul{\mathFlushRef\ifx\empty\math~pMu1\e1se 140. \matha=\mathlhMul\relax\mathb=\mathAcc\relax\mathOpMul 141. \edef\mathAcc~\number\matha~\let\math0pMul\empty\fi) 142. % 143. \def\mathFlushRef{\ifx\empty\mathlhRef\else 144. \mathParam

145. \mathlhRef \let\mathlhRef \empty\f i)

146. % 147. \def \mathop{%

148. \if \word+

149. \mathFlushAdd\let\mathlhAdd\mathAcc\let\mathOpAdd\mathAdd\fi

150. \if \word-

151. \mathFlushAdd\let\mathlhAdd\mathAcc\let\mathOpAdd\mathSub\fi 152. \if \word*

153. \mathFlushM~l\let\mathlhMul\mathAcc\let\mathOpMul\mathMul\fi 154. \if \word/

155. \mathFlu~hM~l\let\mathlhMul\mathAcc\let\mathOpMul\mathDiv\fi 156. \if \word=

157. \mathFlu~hRel\let\mathlhRel\mathAcc\let\mathOpRel\mathE~\f i 158. \if \word>

159. \mathFlushRel\let\mathlhRel\mathAcc\let \mathOpRel\mathGT\f i 160. \if \word<

161. \mathFlushRel\let\mathlhRel\mathAcc\let\mathOpRel\mathLT\fi 162. \let\mathAcc\empty

163. \matheval)

164. % 165. \def \mathIdentif ier{%

166.\expandafter\ifx\csname C\word\endcsname\relax

167. \expandaf ter\if x\csname F\word\endcsname\relax

168.\expandafter\ifx\csname V\word\endcsname\relax

169. \let\next\matherr\diw{LOSING:\word:) 170. \else\let\next\mathVariable\f i 171. \else\let\next\mathFunction\f i 172. \else\let\next\mathCornmand\f i\next)

173. % 174.\def\mathVariable{\expandafter\edef\expandafter\word\expandafter 175. {\csname V\word\endcsname)\mathbranch)

176. \def \mathCommand{\expandaf ter\mathHardEnd\word)

177.\def\mathFunction{\expandafter\let\expand~ter\mathlhRef

178. \csname F\word\endcsname\matheval) 179. % 180. \def \mathParam{\advance\mathParams by l\relax\chr\mathParams 181. \diw{PARAM: \tmp: \mathAcc :)

182. \expandaf ter\edef \csname P\tmp\endcsname{\mathAcc)) 183.\def\mathComma{\math~nd\mathParam\mathInit\matheva1)

184. \def\mathEndRecurse{\mathEnd\advance\parens by -l\matheval) 185. \def\mathEnd{\diw{MATHEND: ACC=\mathAcc:)\mathFlushRel 186. \xdef\mathtemp{\mathAcc)\endgroup\edef\mathAcc{\mathtemp))

187. \def \mathHardEnd{\ifnum\parens>O\errmessage{Insuf f i c e closeparens. )\relax

188. \let\next\endeval\else\let\next\mathFinal\fi\next)

189. \def\mathFinal{\mathEnd\let\value\mathAcc\endexpression) 190.\def\matherr{\errmessage~tax error: Unknown symbol \word))

191.\def\endexpression{\afterexpression)

TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting

Andrew Marc Greene

192. % 193. % ' ~ % % ' ~ % % r ~ ~ ~ % ' ~ % % % ' ~ X ' ~ ~ % ' X ' ~ % ' X ' ~ % % ' ~ ~ ~ ~ % % ' X ' X ' ~ ~ % ' ~ ~ ~ % ' X ' X ' ~ ~ ~ ~ X ' ~ ~ % % % ' ~ ~ % 194. % 195. % Linked List 196. % 197.\def\gotofirstline{\edef\lpointer{\csname LO\endcsname)}

198.\def\foreachline#1{\ifnum\lpointer~99999\edef\word{\lpointer)#~% 199.\edef\lpointer{\csname L\word\endcsname)\foreachline{#l)\fi) 200. % 20l.% gotopast{#l) where #1 is a line number, will set \lpointer to 202. % the least value such that L(lpointer)>#l 203. % 204. \def \gotopast#i{\def \lpointer{O)\def \target{#l)\gotopastloop) 205. % 206. \def \gotopastloop{\edef \tmp{\csname L\lpointer\endcsname)% 207. \ifnum\tmp<\target%

208.\edef\lpointer{\csname L\lpointer\endcsname)%

209.\~et\next=\gotopastloop\else\let\next=\relax\fi 210. \next) 211. % 212. \flageol\def\addLineToLinkedList#l#2 213. {\def#i{#2)\diw{Just stored #2 in \noexpand #I)%

214. % now put it into linked list. . . 215.\expandafter\ifx\csname L\uord\endcsname\relax% if it isn't already there,

216.\gotopast{\word)% \def\lpointer{what-should-point-to-word) 217. \expandafter\edef \csname L\word\endcsname{\csname L\lpointer\endcsname)%

218. \expandaf ter\edef \csname L\lpointer\endcsname{\uord)% 219. \f i\endeval

220. )\endf lageol 221.\expandafter\def\csname LO\endcsname{99999) 222. % 223. %X%%'lh%%'lA%%#%%%%%%'A%%%X%%%%% 224. ;!

225. % Program Parser 226. % 227.\def\evalline{%\iw{EVALLINE :\word:)%

228. \csname C\uord\endcsname) %error-checking? :-)

229. \def\evalerrorC\errmessage{Unkonwn command. Sorry.)) 230. % 231. % \mandatory takes one argument and checks to see if the next 232.X non-whitespace token matches it. If not, an error is generated. 233. % 234. \def \mandatory#l{\def \tmp{#l)\mandatest) 235.\def\mandatest#i{\def\tmpp{#i)\ifx\tmp\tmpp\~et\next\afterscan\e~se

236. \let\next\manderror\f i\next)

237.\def\manderror(\errmessage{\tmpp\spc read when \tmp\spc expected.)% 238. \af terscan) 239. % 240.X \parseline gets the first WORD of the next line. If it's a line 241.X number, \scanandstoreline is called; otherwise the line is executed. 242. % 243. \def \parseline{\after\f irsttest\scan)

244. \def \f irsttest{\expandaf ter\if \numberP\word 245. \let\next\grabandstoreline\else\let\next\evalline\fi\next)

246. \def\grabandstorelineC\diw{Grabbing line \word.)% 247.\expandafter\addLineToLinkedList\csname/\word\endcsname}

248. % 249. %%%%;l,#%%%'l/lA%'llA%%%'lA%'A%%'A%'A%%'A%%%%%'lA%'h%'A%'llA%%%%%%'A%%% 250. % 251. % Syntactic Scanner 252. % 253.X The \scan routine reads the next WORD and then calls \afterscan. 254. ;!

255. % As syntactic sugar, one can write \after\foo to set \afterscan to

388 TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting

BASIX - An Interpreter Written in

256. % \f00.

257. % 258.1 Here are the rules governing WORD. Initial whitespace is

259.X discarded. The word is the next single character, unless that

260. % character is one of the following:

261. % 262. % A-Z or a-z : [A-Z, a-z] [A-Z , a-z ,0-a*\$? 263. % 0-9: [O-9]+

264. % " : $1 [-,,I * t ,

265. % <,=,>: [<=>I [<=>I? (one or two; not the same if two)

266. % 267. % Note that the string literal ignores spaces but may be abnormally

268. % terminated by an end-of-line. (I wasn't sure how to express that 269. % as a regexp) . 270. % 271. % 272. \newif \if scan j: shall we continue scanning?

273. % 274.\def\scan{\def\~ord()\futurelet\~\scanFirst~

275. % 276.\def\scanFirstC% Checks the first character to determine type. 277. \let\next\scanIter

278. \expandafter\if \spc\noexpand\q % Space -- ignore it 279. \let\next\scanSpace\else

280. \if \eol\noexpand\q % End of line -- no word here 281. \let\next\scanEnd\else 282. \if cat A\noexpand\q % Then we have an identifier 283. \let\scanTest\scanIdentifier\else

284. \expandaf ter\if \digit\q 285. \let\scanTest\scanNumericConstant\else

286. \if It\noexpand\q 287. \let\scanTest\scanStringConstant\else

288. \expandaf ter\if \relationP\q 289. \let\scanTest\scanRelation

290. \else 291. \let\scanTest\scanf alse

292. \f i\f i\f i\f i\f i\f i\next}

293. % 294.\def\scanIter#~{\expandafter\def\expandafter\word\expdafter\word #I}

295. \futurelet\q\scanContinuePl 296. \def\scanContinueP{\scanTest\ifscan\let\next\scanIter 297. \else\let\next\scanEnd\fi\next}

298. % 299.\def\scanSpace#l{\scan)% If the first char is a space, gobble it and try again.

300. \def \scanIdentif ierf\if cat A\noexpand\q\scantrue\else 301. \expandafter\if \digit\q\scantrue

302. \else\if $\noexpand\q\scantrue

303. \expandaf ter\def \expandaf ter\word\expandaf ter{\expandaf ter$\word}

304. \let\scanTest\scanf alse\else 305. \scanf alse\f i\f i\f i)

306. \def \scanEndStringf\scanf alse} 307.\def\scdumericConstantI\expandafter\if\digit\q\scantrue\e~se\scanfa~se\fi~

308.\def\scanStringConstant{\scantrue\if"\q\let\scanTest\scanfalse\fi~ 309.\def\sc~elation{\if<\q\scantrue\else\if>\q\scantrue\else\if=\q\scantrue

310. \else\scanf alse\f i\f i\f i} 311. % 3 1 2 . \ d e f \ ~ ~ ~ n d # 1 { \ ~ e l a ~ \ d i ~ { S c A N N E D : \ W 0 r d : }

313. \af terscan #I}% dumps trailing spaces. 314. \def \af ter{\let\af terscafl

315. % 316. %#%%%X%%X%%%%%rA%'A%%%'A%%'b%'A%%'/A%'A%%%%%%%%rk%%r/A%%'L%%'A%%%'/A 317. % 318. % Type Tests (Predicates for type determination) 319. %

TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting

Andrew Marc Greene

32o.\def\relationP#l{tf) % for now, only single-char relations 321. \def \identif ierP#l{\expandaf ter\ident;if ierTest #I\\) 322,\def\identifierTest#l#2\\{\ifcat A#l\itstrue\else\itsfalse\fi) 323. \def \stringvarP#l{\expandaf ter\stringvarTest #I\\)

324.\def\stringvarTest#l#2\\C\if$#l\itstrue\e~se\itsfa1se\fi)

325. \def \stringP#l{\expandafter\stringTest #I\\) 326.\def\stringTest#l#2\\i\if #l"\itstrue\else\itsfalse\fi)

327.\def\numberP#l{\expandafter\numberTest #I\\) 328.\def\numberTest#l#2\\{\expandafter\if\digit #l\itstrue\else\itsfalse\fi)

329. \def \macroP#l{\expandaf ter\macroTest #I\\)

33O.\def\macroTest#~#2\\{\expandafter\ifx #l\relax\itstrue\else\itsfalse\fi)

331. % 332. % \digit tests its single-token argument and returns tt if true,

333. % tf otherwise.

334. % 335. % 336. \def \digit#1{%

337. \if O\noexpand#l\itstrue\else

338. \if l\noexpand#l\itstrue\else 339. \if 2\noexpand#l\itstrue\else

340. \if 3\noexpand#l\it strue\else

341 \if4\noexpand#l\itstrue\else

342. \if 5\noexpand#l\itstrue\else 343. \if 6\noexpand#l\itstrue\else 344. \if 7\noexpand#l\itstrue\else 345. \if 8\noexpand#l\itstrue\else

346.\if9\noexpand#l\itstrue\else\itsfalse 3 4 7 . \ f i \ f i \ f i \ f i \ f i \ f i \ f i \ f i \ f i \ f i

348.%9 8 7 6 5 4 3 2 1 0

349. )

350. % 351. %'l/;/:l/;ll~lA'/;/;/l/;//:/l/1(~ll/;/;/;l~A'/;/:l/;l//;/:/;l/;/~~~ll/;A'l/:~//:/;lll/;~l/:l/;/;/e 352. % 353. % User Utilities. These are the commands that are called by the

354.X user. We could really use a better section name. :-I 355. % 356 % List (one line or all lines, for now)

357. % 358. \def \Clist{\af ter\listmain\scad

359 \def\listmain{\isnul1I\word~\ifresult\let\next\listalllines 360.\else\let\next\listoneline\fi\next)

361. \def\listline{\iw{\word\spc\csname/\word\endcsname)~ 362.\def\listall~ines{\gotofirstline\foreach~ine{\~ist~ine)\endeva~)

363. \def\listoneline{\listline\endeval) 364. % 365. %%%%%%%'lA%%%%%%%'A%'lA%%%'A%%'lA'A%%'lA%'A%%%%%%%'A%%%'/A%'/A%%'A%'L%%%'/llA%%%% 366. % 367.X Different degrees of "stop execution"

368. % 369. \def \Csystem{\end) % exits to the system 370.\def\Cexit{)% \endflageol) % exits to TeX 371. \f lageoS/,

372. \def \Cstop#l

373. {\iu{Stopped in \lineno.)\cleanstop)%

374.\def\cleanstop{\diw~LEANSTOP)\let\endeva1\enduser1ine\endeva1

375. )\endf lageol

376. % 377. %'lA%%%%%%X%X%%%%%%%%'lA%'A%%%'A%%'A%%'A%%'A%%'tA%'tA%%'lA%%%%%'A%'A%%'A%'A%%'A%'lA 378. % 379. % The command "rem" introduces a remark

380. % 381. \def \Crem{\endeval)%

382. % 383. %'l~%%%%'A'~A%%'h' tA%'~%%'A%%%'l~X'A%%%%'l~%'~~%%'A%'t lA%'~lA%%%%'A%%'l~%' lA%%'~%%'t~A

390 TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting

BASIX - An Interpreter Written in T@

384. % 385.X The "let" command allows variable assignments

386. % 387. \def\Clet{\after\letgetequals\scan) 388. \def\letgetequals{\after\letgetvalue\mandatory{=))

389. \def \letgetvalue{\after\letdoit\expression)

390. \def \letdoit{\expandaf ter\edef \csname V\word\endcsname{\value)% 391. \endeval)

392. % 393. %'h'A'lllr/:A'/:ll/:lll/:lA'rl/:/:l/:lA'/:lh'll/:rlr/;l/:/:/I%'/:lr/;A'/:/:l/:ltlr/:A'A'l/:/:l/:l/:l/, 394. % 395 % The "print" command takes a [list of] expression[s] and displays

396 % it [them].

397. % 398. \def\Cprint{\after\printit\expression) 399.\def\printit{\iw{\value}\endeval)

400. % 401. %X%%%%%%%'A%%X%%;I,%%%%%%%%%%%%%%'A%'cA%% 402. % 403. % The "if " command takes an expression, the word "then," and

404. % another command. If the expression is non-zero, the command is

405.% executed; otherwise it is ignored.

406. % 407.\def\Cif{\after\getift\expression)

408.\def\Cthen{\errmessage{Syntax error: THEN without IF)) 409. \def \getif t{\after\getifh\mandatory t)

410. \def \getifhi\after\getif e\mandatory h)

411. \def\getife{\after\getifn\mandatory e)

412. \def\getifn{\after\consequent\mandatory n)

413. \def \consequent{\ifnum\value=0\let\next=\endeval\else\let\next=\evalconsq\f i 414. \next)

415. \def\evalconsq~\after\evalline\sca~

416. % 417. %%%%%%X%%%;/,%%%%%'rX'A%'/A%%'A%%%%%%'A%%-lA%%%%%%'/A%'A% 418. % 419. % Functions

420. % 421.% Functions may read the counter \mathparams to find out the number

422. % of the top parameter. Parameters are in Pa Pb PC etc. 423. % 424.\def\return{\expandafter\def\expandafter\mathAcc\expandafter) 425. % 426. \def \Finc{\matha=\Pa \advance\matha by 1

427. \return{\{\mnumber\matha)) 428. % 429. \def\Fmin{\ifnum\Pa<\Pb\return{\Pa}\else\return{\Pb)\fi) 430. % 431. %%%%'lA%%X%;/,%%%%%j/,%%%'A'A%%%%%%%%%%%%'!I%%%%'A%%%'A%%'h%%%%%%%%%%%% 432. % 433. % Program execution control

434. % 435.\def\Crun~\let\endeval\endincrline\def\linen0{0)\endeva1) 436. \def\Cgoto{\let\endeval\endgotoline\after\gotomain\scan)

437. \def \gotomain{\edef \lineno{\word)\endeval)

438. % 439. \f lageol%

440. \def\execline~%\message~Executing line \lineno...)% 441. \edef\theline{\csname/\lineno\endcsname)%

442. %\message{THE LINE\theline)%

443. \let\endeval\endincrline\after\evalline\expandaf ter\scan\theline 444. )\endf lageol

445. % 446. % Different varieties of what to do at the end of a command:

447. %

TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting

Andrew Marc Greene

448. % get new line from user (enduserline)

449.X get next line in order (endincrline) 450. % get line in \lineno (endgotoline) 451. % keep parsing current line (endcolonline tbi)

452. % 453. \f lageol% 454. \def \endincrline#l

455. {\diw{ENDINCRLINE)\edef \lineno{\csname L\lineno\endcsname)\execnextline~ 456. % 457. \def \endgotoline#l

458. {\diw{ENDGOTOLINE)\1et\endeval\execincrline\execnextline)% 459. % 460. \def\execnextline(\diw(Ready to execute line \lineno...)%

461. \ifnum\~ineno<99999\~et\next\exec~ine\e~se\~et\next\c~e~stop\fi\next~%

462. % 463. \def \enduserline #1 464. {\diw{ENDUSERLINE)\parseline)\endf lageol

465. % 466. \let\endeval\enduserline

467. % 468. %'A%%%%%%%%%%%%%'/A%'l/A%%%%%%%%%%'/A%'rA%%%'h%%'/A%'rX'A%%'r/rA%%%'rA%%%'A% 469. % 470. % Start your engines! 471. % 472.\iw{This is BaSiX, v0.3, emphasis on the SICK! by amgreeneQmit.edu)

473. \f lageol 474. \catcode32=12

475. \endeval

476. 477.---basix.tex ends here. The blank line at the end is significant.

TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting