basix -an interpreter written in tex - tug
TRANSCRIPT
BASIX -An Interpreter Written in TEX
Andrew Marc Greene MIT Project Athena, E40-342, 77 Massachusetts Avenue, Cambridge, MA 02139
617-253-7788. Internet: [email protected]
Abstract
An interpreter for the BASIC language is developed entirely in W . The interpreter presents techniques of scanning and parsing that are useful in many contexts where data not containing formatting directives are to be formatted by m. W'S expansion rules are exploited to provide a macro that reads in the rest of the input file as arguments and which never stops expanding.
Introduction
It is a basic tenet of the T@ faith that rn is Turing-equivalent and that we can write any program in W . It is also widely held that is "the most idiosyncratic language known to us."' This project is an attempt to provide a simple programming front end to W.
BASIC was selected because it is a widely used interpreted language. It also features an infix syntax not found in Lisp or POSTSCRIPT. This makes it a more difficult but more general problem than either of these others.
The speed of the BASIX interpreter is not impressive. It is not meant to be. The purpose of this interpreter is not to serve as the BASIC implementation of choice. Its purpose is to display useful paradigms of input parsing and advanced W programming.
Interaction with
Associative arrays. Using \csnane it is possible to implement associative arrays in 7&X. Associative arrays are arrays whose index is not necessarily a number. As an example, if \s tudent has the name of a student, we might look up the student's grade with
which would be \grade.Greene in my case. (In the case of \csname, all characters up to the \endcsname are used in the command sequence regardless of their category code.)
Ward, Stephen A. and Robert H. Halstead, Jr., Computation Structures, MIT Press, 1990
BASIX makes extensive use of these arrays. Commands are begun with C; functions with F; variables with V; program lines with /; and the linked list of lines with L. This makes it easy for the interpreter to look up the value of any of these things, given the name as perceived by the user.
One benefit of \csname.. . \endcsname is that if the resultant command sequence is undefined. ~'FJ replaces it with \ relax. This allows us to check, using \ i fx , whether the user has specified a non-existent identifier. This trick is used in exercise 7.7 in The m b o o k . We use it in BASIX to check for syntax errors and uninitialized variables.
Token streams. The BASX interpreter was de- signed to be run interactively. It is called by typing t e x basix; the file ends waiting for the first line of BASIC to be entered at W ' s * prompt. This also allows other files to \input basix and immediately follow it with BASIC code.
We cannot have the scanner read an entire line at once, since if the last line of bas ix . tex were a macro that reads a line as a parameter, we'd get a "File ended while scanning use of
\ge t l inen error. Instead, we use a method which at first blush seems more convoluted but which is actually simpler.
We note that TEX does not make any dis- tinction between the tokens that make up our interpreter and the tokens that form the BASIC code. The BASIX interpreter is carefully constructed so that each macro ends by calling another macro (which may read parameters). Thus, expansion is never completed, but the interpreter can continue to absorb individual characters that follow it. These characters affect the direction of the expansion; it is this behaviour that allows us to implement a BASIC interpreter in W .
TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting 381
Andrew Marc Greene
Category codes. Normally, TEX distinguishes be- tween sixteen categories of characters. To avoid unwanted side effects, the BASIX interpreter reas-
signs category codes of all non-letters to category 12, "other". One undocumented feature of TJ$ is that it will never let you strand yourself without
an "active" character (which is normally the back-
slash); if you try to reassign the category of your
only active character (with, e.g., \catcode'\\=12),
it will fail silently. To allow us to use every typeable character in BASIX, we first make \catcode31=0,
which then allows us to reassign the catcode of the backslash without stranding ourselves. (Of course,
the BASIX interpreter provides an escape back to TEX which restores the normal category codes.)
We also change the category code of --M, the end-of-line character, to "active". This lets us detect end-of-line errors that may crop up.
Semantics
Words. A word is a collection of one or more
characters that meet requirements based on the first character. The following table describes these
rules using regular expression n ~ t a t i o n . ~ These rules are:
First Regular Character Expression Meaning
[A-Z, a-zl [A-Z, a-z] [A-Z, a-z ,0-9]*$? Identifier LO-91 CO-91+ Integer
I8 V! [^Ill * [ll , -'MI String
Other Symbol
An end-of-line at the end of a string literal is converted to a I' .
Everything in BASIX is one of these four types. Line numbers are integer literals, and both variables
and commands are identifiers.
The \scan macro is used in BASIX to read
the next word. It uses \ f u t u r e l e t to look at the next token and determine whether it should be part
of the current word; ie . , whether it matches the regular expression of the current type. If so, then it
is read in and appended; otherwise \scan returns,
In this notation, [A-Z, a-zl means "any char- acter falling between A and Z or between a and z ,
inclusive." An asterisk means "repeat the preceding
specification as many times as needed, or never." A plus means "repeat the preceeding specification as
many times as needed, at least once." A question mark means "repeat the preceeding specification zero or one times." A dot means "any character."
leaving this next token in the input stream. The word is returned in the macro \word.
The peculiar way \scan operates gives rise to
new problems, however. We can't say
because \scan looks at the tokens which follow
it, which in this case are \lineno=\word. We
need some way to define the goto command so
that the \scan is at the end of the macro; this
will take the next tokens from the input stream.
We therefore have \scan "return" to its caller by
breaking the caller into two parts: the first part ends
with \scan and the second part contains the code which should follow. The second macro is stored in \af terscan, and \scan ends with \af terscan.
As syntactic sugar, \ a f t e r has been defined as
\ l e t \ a f t e r scan . This allows \Cgoto to be coded
as
(Actually, the goto code is slightly more compli- cated than this; but the scanner is the important
point here.) This trick is used throughout the
BASIX interpreter to read in the next tokens from
the user's input without interrupting the expansion
of the W macros that comprise the interpreter.
Expressions. An expression is a sequence of words
that, roughly speaking, alternates between val-
ues and operators. Values fall into one of three
categories: literals, identifiers, and parenthesized
expressions. An operator is one of less-than, more- than, equality, addition, subtraction, multiplication,
division, or reference. (Reference is an implicit op-
erator that is inserted between a function identifier
and its parameters.) Expressions are evaluated in an approach sim-
ilar to that used in the scanner. A word is scanned using \scan and its type is determined.
"Left-hand" values are stored for relatives, addi-
tives, multiplicatives, and references. Using W ' s
grouping operations the evaluator is reentrant, per- mitting parenthesized expressions to be recursively
evaluated.
In order to achieve a functionality similar to
that of \ fu tu re l e t , we exit the evaluator by
\expandafter\aftereval\word, where the macro
\ a f t e r eva l is analogous to \af terscan. Since
\word will contain neither macros nor tokens whose
category codes need changing, this is as good as
\f u tu re l e t .
382 TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting
BASIX - An Interpreter Written in
Structure of the Interpreter
The file basix.tex, which appears as an appendix to this paper. defines all the macros that are needed to run the interpreter. The last line of this file is \endeval, which is usually called when the interpreter has finished evaluating a line. In this case, it calls \enduserline, which, in turn. calls \parseline.
The \parsel ine macro is part of the Program
Parser section of the interpreter. It starts by calling \scan to get the first word of the next line. If \word is an integer literal. it is treated as a line number and the rest of the input line is stored in the appropriate variable without interpretation. If \word is not an integer literal, it is treated as a command.
Each command is treated with an ad hoe
routine near the bottom of basix . tex: however. most of them call on a set of utilities that appear earlier in the file.
Character-string calls. There is a simple library of macros that convert between ASCII codes and character tokens, test for string equality, take subsections of strings, and deal with concatenation.
Debugging definitions. The macro \diw is a debugging-mode-only \immediate\writ e l6 (hence the name \diw). It is toggled by the user commands debug and nodebug.
Expression evaluation. The expression evaluator has a calling structure similar to that of the scanner. Calling routines are split in twain, with
being a prototype of the calling convention. The evaluator will \scan as many words as it can that make sense; in contrast to the scanner, however, it evaluates each instead of merely accumulating them. This process is described above.
Linked list. The BASIX interpreter maintains a linked list of line numbers. The macro
contains the next line number. These macros will follow the linked list (for the l is t command); they also can insert a new line or return the number of the next line.
Program parser. This section contains a number of critical routines. \ eva l l i ne is the macro that does the dispatching based on the user's command. \mandatory specifies what the next character must be (for example, the character after the identifier
in a l e t statement must be =). \parsel ine has already been described.
Syntactic scanner. This is the section containing \scan and its support macros, which are described above.
Type tests. These routines take an argument and determine whether it is an identifier, a string variable, a string literal, an integer literal, a macro, or a digit. The normal way of calling these routines is
These predicates expand into either tt (true) or t f (false). Syntactic sugar is provided in the form of \ i t s t r u e a n d \ i t s f a l s e . \ ifstring\worddoesn't work because of the way matches \ i f and \f i tokens - only \if-style primitives are recognized.
User Utilities. This is the section of the inter- preter in which most of the user commands are defined. Commands are preceded by \C ( e . g . ,
\ C l i s t is the macro called when the user types l i s t ) . Functions are preceded by \F.
Limitations of this implementation
BAW is a minimal BASIC interpreter. There are enough pieces to show how things work, but not enough to do anything practical. Here is a description of the capabilities of this interpreter, so that the reader can play with it. Error recovery is virtually non-existent, so getting the syntax right and not calling non-existent functions is critical.
Entering programs. Lines beginning with an in- teger literal are stored verbatim. Lines are stored in ascending order, and if two or more lines are entered with the same number, only the last is retained.
Immediate commands. Lines not beginning with an integer are executed immediately. Colons are not supported, so only one command may appear on a line. (When a program line is executed, its line number is stripped and the remainder is executed as though it were an immediate command.)
Commands. The following commands are imple- mented in some form: goto, run: l i s t , pr in t , l e t , i f , debug, nodebug, rem. system, ex i t , and s top (but not cont). The interpreter is case sensitive (although with an appropriate application of \up- percase it needn't be: I was lazy), so these must be entered in lowercase.
The following tables list the commands with no parameters, the commands that take one parameter,
TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting 383
Andrew Marc Greene
and the two commands ( l e t and i f ) that take special forms.
The commands with no parameters are:
run
stop
l i s t
debug
nodebug
rem
system
e x i t
Starts execution at the lowest line number. Does not clear variables. Stops execution immediately. Lists all lines in order. Useful information sent to terminal. Stop debugging mode. Rest of line is ignored. Exit m. Exit BASIX to m.
The commands with one parameter are:
goto Starts execution at the given line number. l i s t Lists the given line. pr in t Displays the given argument.
or to take source code in a given language and pretty-print it.
The definition of a "word" can be changed by modifying the \scan macro. It selects a def- inition for \scantest based on the first char- acter; scantes t is what determines if a given token matches the selected regular expression. The \scantest macro is allowed to redefine itself.
The evaluation of expressions can be extended or changed by modifying the \math code. Floating- point (or even fixed-point) numbers could be dealt with, although the period would need to pass the \d ig i tP test in some cases and not in others.
The method of dealing with newlines is easily removed for languages such as POSTSCRIPT or Lisp for which all whitespace is the same.
(Any of these arguments may be an expression.) Obtaining copies of basix.tex The l e t and i f commands take special forms.
Variable assignments require an explicit l e t com- The Source code to this Paper and the BASI~
mand: interpreter are available by anonymous ftp from gevalt . m i t .edu. which is at IP address 18.72.1.4.
l e t ( identz f ier) = ( e x p ~ e s s i o n ) I will also mail out copies to anyone without ftp
Conditionals do not have an e l s e clause, and goto abilities, is not implie6 by then:
i f (expression) then ( n e w c o m m a n d )
The new command is treated as its own line.
Expressions. Expressions are defined explicitly above. The operators are +, -, *, /, <, =, and >. Parentheses may be used for grouping. Variables may not be referenced before being set. (Unlike in traditional BASIC, variables are not assumed to be 0 if never referenced, and they aren't cleared when run is encountered).
Functions are invoked with
( f u n c t i o n n a m e ) ( ( p a r a m ) , ( p a r a m ) , . . . ) The parameters are implicitly-delimited expressions that are passed to \matheval (which is simply called \eval in the table below to save space). The following functions are defined: l en ( s t r ing) Returns number of characters
in the string. chr$ (expr) Returns the character with the
given ascii value. inc (expr) Returns \eval(expr) + 1. min(expr1, expr2) Returns the lesser of two \evals.
Generalization
The BASIX interpreter can easily be generalized t o serve other needs. These other needs might be to interpret Lisp or POSTSCRIPT code [Anyone want to write a POSTSCRIPT interpreter in m?];
Summary
Using a number of TEX tricks, some more devious than others, a BASIC interpreter can be written in QX. While QX macros will often be less efficient than, for example, auk paired with TEX, solutions using only will be more portable. A less general macro package than BASIX could be written that uses these routines as paradigms and that is very efficient at parsing a specific input format.
TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting
BASIX -An Interpreter Written in
Listing of basix . t e x 1.---basix.tex begins here.
2. % 3.% BaSiX (with the emphasis on SICK!) by Andrew Marc Greene
4. % 5. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%'rrL%'L%%'rrL%%%%'L%%%%'L%%%%%% 6. % 7. % Andrew's Affiliations
8. % 9. % Copyright (C) 1990 by Andrew Marc Greene
lo. % <amgreeneQmit . edu> 11. % MIT Project Athena 12. % Student Information Processing Board
13. % All rights reserved. 14. % 15. r:rr~'~'~:~:~r~'r~:~'~'~:rr~~r:rr~:r~:r~:r~'rr~:r~~~:r~:~:rrrr~;rrrr~;~;~:r~;rr~:r~ 16. % 17. % BaSiX's Beginnings 18. % 19. \def \f lageol~\catcode13=13}
20.\def\endflageol{\catcode13=5) 21. \def\struncat{\catcode1\$=12)
22. \def\strcat{\catcodel\$=ll}
23. \f lageol\let\eol 24. \endf lageol 25. \newif \ifresult
26. %\newcount \xa\neucount \xb
27. \def \iw(\immediate\uritel6} 28. \def \empty{) 29. \def \gobble#l{)
30. \def \spc{ )
31. \def\itstrue{tt} 32. \def \itsf alse{tf )
33. \def \isnull#l{\resultf alse
34.\expandafter\ifx\csname ernpty#l\endcsname\empty\resulttrue\fi} 35. \newcount \matha\newcount\mathb
36. % 37. %'l.%%%%%%%%%'A'rrL;/,%%%%%%%%%%%%%'L%'L%%%'L%%%%%%%%%%%%'L%'L% 38. % 39.X Character-string Calls
40. % 41. \newcount\strtmp
42.\def\ascii#1{\strtmp1#1)
43.\def\chr#1{\begingroup\uccode65=#l\uppercaseC\gdef\tmp{A)}\endgroup}
44. \def\strlen#l{\strtmp-2% don't count " " \iu tokens 45. \expandafter\if \stringP #l\let\next\strIter\strIter #l\iw\f i)
46. \def \strIter#l{\if x\iw#l\let\next\relax\else\advance\strtmp by l\relax 47. \f i\next}
48.\def\Flen{\expandafter\strlen\expandafter{\Pa}\retun{\nmber\strtmp}} 49. \strcat
50. \def \F$chr${\expandaf ter\chr\expandafter{\Pa}\return{\tmp}} 51. \struncat
52. % first char only: 53. % \def \Fasc{\expandafter\asc\expandaf ter(\)) 54. % 55. %%%%%'crrL%%%%%%%%%%'rrL%'L%%%%'rrX'L%'rrrL%%%%%'rrL%'L%%'L%'L%%'rL% 56. % 57. % Debugging Definitions
58. %
TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting
Andrew Marc Greene
64. %%%%%%%%%%%%%%%%'A%%'A%%%%%%'/A%%'~A%%#%%'L 65. % 66. % Expression Evaluation 67. % 68. \def\expression{\let\afterexpression\afterscan\math) 69. % 70.% (math is a misnomer and should -> expr) 71. % 72. \newcount\parens\newcount\mathParams 73. \def \math{\parens=O\mathParams96\mathInit\matheval)
74.\def\mathRecurse{\advance\parens by l\relax\mathParams96\mathInit\matheval) 75. \def \mathInit{\begingroup 76. \let\mathAcc\empty
77. \let\mathOpRel\empty 78. \let\mathOpAdd\empty
79. \let \mathOpMul\empty 80. \let\mathlhFlef \empty)
81. % 82. \def \matheval{\af ter\mathbranch\scan) 83. % 84.+def\mathbranch{\diw{EXPRESS:\expandafter\noexpand\word:) 85. \let \next \matherr 86.\ifx\empty\uord\let\next\mathHardEnd\else % Expr. end? 87.\expandafter\if\numberP\uord\let\next\matiteral\diw)\fi '/, Number? 88.\expandafter\if\stringP\word\let\next\mathLiteral\fi String literal?
89.\expandafter\if\identifierP\word\let\next\mathIdentifier\fi % Identifier? 90.\expandafter\if\stringvarP\word\let\next\mathIdentifier\fi % :-(
9~.\expandafter\if\macroP\word\let\next\mathMacro\fi % Macro? 9~.\ifx\word\0left\let\next\mat~ecurse\fi % Open paren? 93.\ifx\word\0right\let\next\mathEndRecurse\fi % Close paren? 94.\ifx\uord\0comma\let\next\mathComma\fi % Comma? 95. % 96. % Operator? 97. % 98. \if x\word\Oplus\let\next\math0p\diu{ ! +>\f i 99.\ifx\word\Ominus\let\next\mathOp\diu{!-)\fi
loo.\ifx\word\Otimes\let\next\mathOp\diu{!*)\fi lOl.\ifx\uord\Odiv\let\next\math0p\diw{!/)\fi lO2.\ifx\word\Olt\let\next\mathOp\diw{!<)\fi l03.\ifx\word\Oeq\let\next\math0p\did!=)\fi
l04.\ifx\word\Ogt\let\next\math0p\div(!>)\fi 105. % 106. \f i\next) 107. % 108. \def\Oleft{()\def \Oright{))\def \Ocomrna{,)
109.\def\~plus{+)\def\~minusC-)\def\0timesC+)\def\~div{/~ 110. \def \Olt{<)\def \OeqC=)\def \Ogt{>) 111. % 112.% There's got to be a better way to do the above .... 113. % 114.\def\math~iteral{\diw{MLIT)\ifx\empty\mathAcc\diw{AC~~:\word:) 115. \expandafter\def \expandafter\mathAcc\expandafter 116.~\expandafter\expandafter\expandafter\empty\word) 117. \else
118.\diw{ACC has :\mathAcc: and word is :\word:) 119.\errmessage{Syntax Error: Two values with no operator)\fi\matheval) 120. % 121.% Operator stuff: (Need to add string support / error checking) 122. % 123. \def \mathAdd{\advance\matha by \mathb)
124. \def \mathSub{\advance\matha by -\mathb)
125. \def\mathM~l(\multiply\matha by \mathb) 126. \def \matMiv{\divide\matha by \mathb)
127. \def\mathEQ{\ifnum\matha=\mathb\matha-l\else\mathaO\fi)
386 TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting
BASIX-An Interpreter Written in TEX
128. \def \mathGT{\ifnum\matha>\mathb\matha-1\else\mathaO\f i) 129. \def\mathLT{\ifnum\matha<\mathb\matha-l\else\mathaO\fi)
130. % 131. \def \mathFlushRel{\mathFlushAdd\if x\empty\mathOpRel\else 132. \matha=\mathlhRel\relax\mathb=\mathAcc\relax\math0pRel 133. \edef\mathAcc{\number\matha~\let\mathOpRel\empty\fi)
134. % 135. \def\mathFlushAdd{\mathFlushMul\ifx\empty\math0pAdd\else
136. \matha=\mathlhAdd\relax\mathb=\mathAcc\relax\math0pAdd 137. \edef \mathAcc{\number\matha)\let\mathOpAdd\empty\f 1)
138. % 139.\def\mathFlushMul{\mathFlushRef\ifx\empty\math~pMu1\e1se 140. \matha=\mathlhMul\relax\mathb=\mathAcc\relax\mathOpMul 141. \edef\mathAcc~\number\matha~\let\math0pMul\empty\fi) 142. % 143. \def\mathFlushRef{\ifx\empty\mathlhRef\else 144. \mathParam
145. \mathlhRef \let\mathlhRef \empty\f i)
146. % 147. \def \mathop{%
148. \if \word+
149. \mathFlushAdd\let\mathlhAdd\mathAcc\let\mathOpAdd\mathAdd\fi
150. \if \word-
151. \mathFlushAdd\let\mathlhAdd\mathAcc\let\mathOpAdd\mathSub\fi 152. \if \word*
153. \mathFlushM~l\let\mathlhMul\mathAcc\let\mathOpMul\mathMul\fi 154. \if \word/
155. \mathFlu~hM~l\let\mathlhMul\mathAcc\let\mathOpMul\mathDiv\fi 156. \if \word=
157. \mathFlu~hRel\let\mathlhRel\mathAcc\let\mathOpRel\mathE~\f i 158. \if \word>
159. \mathFlushRel\let\mathlhRel\mathAcc\let \mathOpRel\mathGT\f i 160. \if \word<
161. \mathFlushRel\let\mathlhRel\mathAcc\let\mathOpRel\mathLT\fi 162. \let\mathAcc\empty
163. \matheval)
164. % 165. \def \mathIdentif ier{%
166.\expandafter\ifx\csname C\word\endcsname\relax
167. \expandaf ter\if x\csname F\word\endcsname\relax
168.\expandafter\ifx\csname V\word\endcsname\relax
169. \let\next\matherr\diw{LOSING:\word:) 170. \else\let\next\mathVariable\f i 171. \else\let\next\mathFunction\f i 172. \else\let\next\mathCornmand\f i\next)
173. % 174.\def\mathVariable{\expandafter\edef\expandafter\word\expandafter 175. {\csname V\word\endcsname)\mathbranch)
176. \def \mathCommand{\expandaf ter\mathHardEnd\word)
177.\def\mathFunction{\expandafter\let\expand~ter\mathlhRef
178. \csname F\word\endcsname\matheval) 179. % 180. \def \mathParam{\advance\mathParams by l\relax\chr\mathParams 181. \diw{PARAM: \tmp: \mathAcc :)
182. \expandaf ter\edef \csname P\tmp\endcsname{\mathAcc)) 183.\def\mathComma{\math~nd\mathParam\mathInit\matheva1)
184. \def\mathEndRecurse{\mathEnd\advance\parens by -l\matheval) 185. \def\mathEnd{\diw{MATHEND: ACC=\mathAcc:)\mathFlushRel 186. \xdef\mathtemp{\mathAcc)\endgroup\edef\mathAcc{\mathtemp))
187. \def \mathHardEnd{\ifnum\parens>O\errmessage{Insuf f i c e closeparens. )\relax
188. \let\next\endeval\else\let\next\mathFinal\fi\next)
189. \def\mathFinal{\mathEnd\let\value\mathAcc\endexpression) 190.\def\matherr{\errmessage~tax error: Unknown symbol \word))
191.\def\endexpression{\afterexpression)
TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting
Andrew Marc Greene
192. % 193. % ' ~ % % ' ~ % % r ~ ~ ~ % ' ~ % % % ' ~ X ' ~ ~ % ' X ' ~ % ' X ' ~ % % ' ~ ~ ~ ~ % % ' X ' X ' ~ ~ % ' ~ ~ ~ % ' X ' X ' ~ ~ ~ ~ X ' ~ ~ % % % ' ~ ~ % 194. % 195. % Linked List 196. % 197.\def\gotofirstline{\edef\lpointer{\csname LO\endcsname)}
198.\def\foreachline#1{\ifnum\lpointer~99999\edef\word{\lpointer)#~% 199.\edef\lpointer{\csname L\word\endcsname)\foreachline{#l)\fi) 200. % 20l.% gotopast{#l) where #1 is a line number, will set \lpointer to 202. % the least value such that L(lpointer)>#l 203. % 204. \def \gotopast#i{\def \lpointer{O)\def \target{#l)\gotopastloop) 205. % 206. \def \gotopastloop{\edef \tmp{\csname L\lpointer\endcsname)% 207. \ifnum\tmp<\target%
208.\edef\lpointer{\csname L\lpointer\endcsname)%
209.\~et\next=\gotopastloop\else\let\next=\relax\fi 210. \next) 211. % 212. \flageol\def\addLineToLinkedList#l#2 213. {\def#i{#2)\diw{Just stored #2 in \noexpand #I)%
214. % now put it into linked list. . . 215.\expandafter\ifx\csname L\uord\endcsname\relax% if it isn't already there,
216.\gotopast{\word)% \def\lpointer{what-should-point-to-word) 217. \expandafter\edef \csname L\word\endcsname{\csname L\lpointer\endcsname)%
218. \expandaf ter\edef \csname L\lpointer\endcsname{\uord)% 219. \f i\endeval
220. )\endf lageol 221.\expandafter\def\csname LO\endcsname{99999) 222. % 223. %X%%'lh%%'lA%%#%%%%%%'A%%%X%%%%% 224. ;!
225. % Program Parser 226. % 227.\def\evalline{%\iw{EVALLINE :\word:)%
228. \csname C\uord\endcsname) %error-checking? :-)
229. \def\evalerrorC\errmessage{Unkonwn command. Sorry.)) 230. % 231. % \mandatory takes one argument and checks to see if the next 232.X non-whitespace token matches it. If not, an error is generated. 233. % 234. \def \mandatory#l{\def \tmp{#l)\mandatest) 235.\def\mandatest#i{\def\tmpp{#i)\ifx\tmp\tmpp\~et\next\afterscan\e~se
236. \let\next\manderror\f i\next)
237.\def\manderror(\errmessage{\tmpp\spc read when \tmp\spc expected.)% 238. \af terscan) 239. % 240.X \parseline gets the first WORD of the next line. If it's a line 241.X number, \scanandstoreline is called; otherwise the line is executed. 242. % 243. \def \parseline{\after\f irsttest\scan)
244. \def \f irsttest{\expandaf ter\if \numberP\word 245. \let\next\grabandstoreline\else\let\next\evalline\fi\next)
246. \def\grabandstorelineC\diw{Grabbing line \word.)% 247.\expandafter\addLineToLinkedList\csname/\word\endcsname}
248. % 249. %%%%;l,#%%%'l/lA%'llA%%%'lA%'A%%'A%'A%%'A%%%%%'lA%'h%'A%'llA%%%%%%'A%%% 250. % 251. % Syntactic Scanner 252. % 253.X The \scan routine reads the next WORD and then calls \afterscan. 254. ;!
255. % As syntactic sugar, one can write \after\foo to set \afterscan to
388 TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting
BASIX - An Interpreter Written in
256. % \f00.
257. % 258.1 Here are the rules governing WORD. Initial whitespace is
259.X discarded. The word is the next single character, unless that
260. % character is one of the following:
261. % 262. % A-Z or a-z : [A-Z, a-z] [A-Z , a-z ,0-a*\$? 263. % 0-9: [O-9]+
264. % " : $1 [-,,I * t ,
265. % <,=,>: [<=>I [<=>I? (one or two; not the same if two)
266. % 267. % Note that the string literal ignores spaces but may be abnormally
268. % terminated by an end-of-line. (I wasn't sure how to express that 269. % as a regexp) . 270. % 271. % 272. \newif \if scan j: shall we continue scanning?
273. % 274.\def\scan{\def\~ord()\futurelet\~\scanFirst~
275. % 276.\def\scanFirstC% Checks the first character to determine type. 277. \let\next\scanIter
278. \expandafter\if \spc\noexpand\q % Space -- ignore it 279. \let\next\scanSpace\else
280. \if \eol\noexpand\q % End of line -- no word here 281. \let\next\scanEnd\else 282. \if cat A\noexpand\q % Then we have an identifier 283. \let\scanTest\scanIdentifier\else
284. \expandaf ter\if \digit\q 285. \let\scanTest\scanNumericConstant\else
286. \if It\noexpand\q 287. \let\scanTest\scanStringConstant\else
288. \expandaf ter\if \relationP\q 289. \let\scanTest\scanRelation
290. \else 291. \let\scanTest\scanf alse
292. \f i\f i\f i\f i\f i\f i\next}
293. % 294.\def\scanIter#~{\expandafter\def\expandafter\word\expdafter\word #I}
295. \futurelet\q\scanContinuePl 296. \def\scanContinueP{\scanTest\ifscan\let\next\scanIter 297. \else\let\next\scanEnd\fi\next}
298. % 299.\def\scanSpace#l{\scan)% If the first char is a space, gobble it and try again.
300. \def \scanIdentif ierf\if cat A\noexpand\q\scantrue\else 301. \expandafter\if \digit\q\scantrue
302. \else\if $\noexpand\q\scantrue
303. \expandaf ter\def \expandaf ter\word\expandaf ter{\expandaf ter$\word}
304. \let\scanTest\scanf alse\else 305. \scanf alse\f i\f i\f i)
306. \def \scanEndStringf\scanf alse} 307.\def\scdumericConstantI\expandafter\if\digit\q\scantrue\e~se\scanfa~se\fi~
308.\def\scanStringConstant{\scantrue\if"\q\let\scanTest\scanfalse\fi~ 309.\def\sc~elation{\if<\q\scantrue\else\if>\q\scantrue\else\if=\q\scantrue
310. \else\scanf alse\f i\f i\f i} 311. % 3 1 2 . \ d e f \ ~ ~ ~ n d # 1 { \ ~ e l a ~ \ d i ~ { S c A N N E D : \ W 0 r d : }
313. \af terscan #I}% dumps trailing spaces. 314. \def \af ter{\let\af terscafl
315. % 316. %#%%%X%%X%%%%%rA%'A%%%'A%%'b%'A%%'/A%'A%%%%%%%%rk%%r/A%%'L%%'A%%%'/A 317. % 318. % Type Tests (Predicates for type determination) 319. %
TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting
Andrew Marc Greene
32o.\def\relationP#l{tf) % for now, only single-char relations 321. \def \identif ierP#l{\expandaf ter\ident;if ierTest #I\\) 322,\def\identifierTest#l#2\\{\ifcat A#l\itstrue\else\itsfalse\fi) 323. \def \stringvarP#l{\expandaf ter\stringvarTest #I\\)
324.\def\stringvarTest#l#2\\C\if$#l\itstrue\e~se\itsfa1se\fi)
325. \def \stringP#l{\expandafter\stringTest #I\\) 326.\def\stringTest#l#2\\i\if #l"\itstrue\else\itsfalse\fi)
327.\def\numberP#l{\expandafter\numberTest #I\\) 328.\def\numberTest#l#2\\{\expandafter\if\digit #l\itstrue\else\itsfalse\fi)
329. \def \macroP#l{\expandaf ter\macroTest #I\\)
33O.\def\macroTest#~#2\\{\expandafter\ifx #l\relax\itstrue\else\itsfalse\fi)
331. % 332. % \digit tests its single-token argument and returns tt if true,
333. % tf otherwise.
334. % 335. % 336. \def \digit#1{%
337. \if O\noexpand#l\itstrue\else
338. \if l\noexpand#l\itstrue\else 339. \if 2\noexpand#l\itstrue\else
340. \if 3\noexpand#l\it strue\else
341 \if4\noexpand#l\itstrue\else
342. \if 5\noexpand#l\itstrue\else 343. \if 6\noexpand#l\itstrue\else 344. \if 7\noexpand#l\itstrue\else 345. \if 8\noexpand#l\itstrue\else
346.\if9\noexpand#l\itstrue\else\itsfalse 3 4 7 . \ f i \ f i \ f i \ f i \ f i \ f i \ f i \ f i \ f i \ f i
348.%9 8 7 6 5 4 3 2 1 0
349. )
350. % 351. %'l/;/:l/;ll~lA'/;/;/l/;//:/l/1(~ll/;/;/;l~A'/;/:l/;l//;/:/;l/;/~~~ll/;A'l/:~//:/;lll/;~l/:l/;/;/e 352. % 353. % User Utilities. These are the commands that are called by the
354.X user. We could really use a better section name. :-I 355. % 356 % List (one line or all lines, for now)
357. % 358. \def \Clist{\af ter\listmain\scad
359 \def\listmain{\isnul1I\word~\ifresult\let\next\listalllines 360.\else\let\next\listoneline\fi\next)
361. \def\listline{\iw{\word\spc\csname/\word\endcsname)~ 362.\def\listall~ines{\gotofirstline\foreach~ine{\~ist~ine)\endeva~)
363. \def\listoneline{\listline\endeval) 364. % 365. %%%%%%%'lA%%%%%%%'A%'lA%%%'A%%'lA'A%%'lA%'A%%%%%%%'A%%%'/A%'/A%%'A%'L%%%'/llA%%%% 366. % 367.X Different degrees of "stop execution"
368. % 369. \def \Csystem{\end) % exits to the system 370.\def\Cexit{)% \endflageol) % exits to TeX 371. \f lageoS/,
372. \def \Cstop#l
373. {\iu{Stopped in \lineno.)\cleanstop)%
374.\def\cleanstop{\diw~LEANSTOP)\let\endeva1\enduser1ine\endeva1
375. )\endf lageol
376. % 377. %'lA%%%%%%X%X%%%%%%%%'lA%'A%%%'A%%'A%%'A%%'A%%'tA%'tA%%'lA%%%%%'A%'A%%'A%'A%%'A%'lA 378. % 379. % The command "rem" introduces a remark
380. % 381. \def \Crem{\endeval)%
382. % 383. %'l~%%%%'A'~A%%'h' tA%'~%%'A%%%'l~X'A%%%%'l~%'~~%%'A%'t lA%'~lA%%%%'A%%'l~%' lA%%'~%%'t~A
390 TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting
BASIX - An Interpreter Written in T@
384. % 385.X The "let" command allows variable assignments
386. % 387. \def\Clet{\after\letgetequals\scan) 388. \def\letgetequals{\after\letgetvalue\mandatory{=))
389. \def \letgetvalue{\after\letdoit\expression)
390. \def \letdoit{\expandaf ter\edef \csname V\word\endcsname{\value)% 391. \endeval)
392. % 393. %'h'A'lllr/:A'/:ll/:lll/:lA'rl/:/:l/:lA'/:lh'll/:rlr/;l/:/:/I%'/:lr/;A'/:/:l/:ltlr/:A'A'l/:/:l/:l/:l/, 394. % 395 % The "print" command takes a [list of] expression[s] and displays
396 % it [them].
397. % 398. \def\Cprint{\after\printit\expression) 399.\def\printit{\iw{\value}\endeval)
400. % 401. %X%%%%%%%'A%%X%%;I,%%%%%%%%%%%%%%'A%'cA%% 402. % 403. % The "if " command takes an expression, the word "then," and
404. % another command. If the expression is non-zero, the command is
405.% executed; otherwise it is ignored.
406. % 407.\def\Cif{\after\getift\expression)
408.\def\Cthen{\errmessage{Syntax error: THEN without IF)) 409. \def \getif t{\after\getifh\mandatory t)
410. \def \getifhi\after\getif e\mandatory h)
411. \def\getife{\after\getifn\mandatory e)
412. \def\getifn{\after\consequent\mandatory n)
413. \def \consequent{\ifnum\value=0\let\next=\endeval\else\let\next=\evalconsq\f i 414. \next)
415. \def\evalconsq~\after\evalline\sca~
416. % 417. %%%%%%X%%%;/,%%%%%'rX'A%'/A%%'A%%%%%%'A%%-lA%%%%%%'/A%'A% 418. % 419. % Functions
420. % 421.% Functions may read the counter \mathparams to find out the number
422. % of the top parameter. Parameters are in Pa Pb PC etc. 423. % 424.\def\return{\expandafter\def\expandafter\mathAcc\expandafter) 425. % 426. \def \Finc{\matha=\Pa \advance\matha by 1
427. \return{\{\mnumber\matha)) 428. % 429. \def\Fmin{\ifnum\Pa<\Pb\return{\Pa}\else\return{\Pb)\fi) 430. % 431. %%%%'lA%%X%;/,%%%%%j/,%%%'A'A%%%%%%%%%%%%'!I%%%%'A%%%'A%%'h%%%%%%%%%%%% 432. % 433. % Program execution control
434. % 435.\def\Crun~\let\endeval\endincrline\def\linen0{0)\endeva1) 436. \def\Cgoto{\let\endeval\endgotoline\after\gotomain\scan)
437. \def \gotomain{\edef \lineno{\word)\endeval)
438. % 439. \f lageol%
440. \def\execline~%\message~Executing line \lineno...)% 441. \edef\theline{\csname/\lineno\endcsname)%
442. %\message{THE LINE\theline)%
443. \let\endeval\endincrline\after\evalline\expandaf ter\scan\theline 444. )\endf lageol
445. % 446. % Different varieties of what to do at the end of a command:
447. %
TUGboat, Volume 11 (1990), No. 3-Proceedings of the 1990 Annual Meeting
Andrew Marc Greene
448. % get new line from user (enduserline)
449.X get next line in order (endincrline) 450. % get line in \lineno (endgotoline) 451. % keep parsing current line (endcolonline tbi)
452. % 453. \f lageol% 454. \def \endincrline#l
455. {\diw{ENDINCRLINE)\edef \lineno{\csname L\lineno\endcsname)\execnextline~ 456. % 457. \def \endgotoline#l
458. {\diw{ENDGOTOLINE)\1et\endeval\execincrline\execnextline)% 459. % 460. \def\execnextline(\diw(Ready to execute line \lineno...)%
461. \ifnum\~ineno<99999\~et\next\exec~ine\e~se\~et\next\c~e~stop\fi\next~%
462. % 463. \def \enduserline #1 464. {\diw{ENDUSERLINE)\parseline)\endf lageol
465. % 466. \let\endeval\enduserline
467. % 468. %'A%%%%%%%%%%%%%'/A%'l/A%%%%%%%%%%'/A%'rA%%%'h%%'/A%'rX'A%%'r/rA%%%'rA%%%'A% 469. % 470. % Start your engines! 471. % 472.\iw{This is BaSiX, v0.3, emphasis on the SICK! by amgreeneQmit.edu)
473. \f lageol 474. \catcode32=12
475. \endeval
476. 477.---basix.tex ends here. The blank line at the end is significant.
TUGboat, Volume 11 (1990), No. 3 -Proceedings of the 1990 Annual Meeting