LL Parser Generator Assignment
CS440 Programming Languages - Project 1
Scheme: LL Parser Generator

In our unit on syntax analysis we’ve learned how LL(1) PREDICT sets are constructed from FIRST and FOLLOW sets. In the current project you will build, in a purely functional subset of Scheme, a parser generator that implements these constructions.
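For reference, the usual textbook relationship between these sets can be written as below. The notation is the standard one and is not taken from this handout; α ⇒* ε means that α can derive the empty string, and the cases environment assumes the amsmath package.

```latex
\[
\mathrm{PREDICT}(A \to \alpha) =
\bigl(\mathrm{FIRST}(\alpha) \setminus \{\epsilon\}\bigr) \cup
\begin{cases}
\mathrm{FOLLOW}(A) & \text{if } \alpha \Rightarrow^{*} \epsilon \\
\emptyset & \text{otherwise}
\end{cases}
\]
```

For the calculator grammar used in this handout, the empty production of SL has FIRST(ε) − {ε} = ∅, so its predict set is just FOLLOW(SL) = {$$}.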
To get you started, I’m providing a 300-line skeleton file. You will want to study the code in this file carefully.
The key function you are to implement is the following:
(define parse-table
  (lambda (grammar)
    ;;; your code here; my version is about 15 lines long,
    ;;; (but it calls other functions described below)
    ))

The input grammar must consist of a list of lists, one per non-terminal in the grammar. The first element of each sub-list should be the non-terminal; the remaining elements should be the right-hand sides of the productions for which that non-terminal is the left-hand side. The sub-list for the start symbol must come first. Every grammar symbol must be represented as a quoted string. As an example, here is our familiar LL(1) calculator grammar in the required format:
(define calc-gram
  '(("P" ("SL" "$$"))
    ("SL" ("S" "SL") ())
    ("S" ("id" ":=" "E") ("read" "id") ("write" "E"))
    ("E" ("T" "TT"))
    ("T" ("F" "FT"))
    ("TT" ("ao" "T" "TT") ())
    ("FT" ("mo" "F" "FT") ())
    ("ao" ("+") ("-"))
    ("mo" ("*") ("/"))
    ("F" ("id") ("num") ("(" "E" ")"))))
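Because the grammar is an association list keyed by non-terminal, most queries on it reduce to assoc, car, and cdr. The following is a minimal sketch, not part of the skeleton file: the helper names (non-terminals, productions-of, non-terminal?) and the tiny grammar demo-gram are hypothetical and chosen only for illustration.

```scheme
;; A tiny grammar in the required format (hypothetical example).
(define demo-gram
  '(("S" ("A" "$$"))
    ("A" ("a" "A") ())))

;; The non-terminals, in the order the grammar lists them.
(define non-terminals
  (lambda (grammar)
    (map car grammar)))

;; All right-hand sides for non-terminal A, or () if A has no entry.
(define productions-of
  (lambda (A grammar)
    (let ((entry (assoc A grammar)))
      (if entry (cdr entry) '()))))

;; A symbol is a non-terminal exactly when it heads some sub-list.
(define non-terminal?
  (lambda (sym grammar)
    (if (assoc sym grammar) #t #f)))
```

With demo-gram, (productions-of "A" demo-gram) evaluates to (("a" "A") ()), and (non-terminal? "a" demo-gram) is #f because "a" never appears as the first element of a sub-list.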
The parse table, as returned by function parse-table, must have the same format, except that every right-hand side is replaced by a pair (a 2-element list) whose first element is the predict set for the corresponding production, and whose second element is the right-hand side. If you type
(parse-table calc-gram)
the Scheme interpreter should respond

(("P" (("$$" "id" "read" "write") ("SL" "$$")))
 ("SL" (("id" "read" "write") ("S" "SL")) (("$$") ()))
 ("S" (("id") ("id" ":=" "E"))
      (("read") ("read" "id"))
      (("write") ("write" "E")))
 ("E" (("(" "id" "num") ("T" "TT")))
 ("T" (("(" "id" "num") ("F" "FT")))
 ("TT" (("+" "-") ("ao" "T" "TT")) (("$$" ")" "id" "read" "write") ()))
 ("FT" (("*" "/") ("mo" "F" "FT")) (("$$" ")" "+" "-" "id" "read" "write") ()))
 ("ao" (("+") ("+")) (("-") ("-")))
 ("mo" (("*") ("*")) (("/") ("/")))
 ("F" (("id") ("id")) (("num") ("num")) (("(") ("(" "E" ")"))))
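To see how a table in this format might be consulted during parsing, here is a small sketch of a lookup. It is an illustration only: predicted-rhs and demo-table are hypothetical names, the table is a hand-copied fragment of the output above, and the provided parse function may well do this differently.

```scheme
;; Fragment of the table above, enough to illustrate a lookup.
(define demo-table
  '(("SL" (("id" "read" "write") ("S" "SL")) (("$$") ()))
    ("S" (("id") ("id" ":=" "E"))
         (("read") ("read" "id"))
         (("write") ("write" "E")))))

;; Return the right-hand side predicted for non-terminal A on the given
;; token, or #f when no predict set for A contains the token.
(define predicted-rhs
  (lambda (A token table)
    (let ((entry (assoc A table)))
      (if (not entry)
          #f
          (let loop ((pairs (cdr entry)))
            (cond ((null? pairs) #f)
                  ((member token (car (car pairs))) (cadr (car pairs)))
                  (else (loop (cdr pairs)))))))))
```

Here (predicted-rhs "S" "read" demo-table) evaluates to ("read" "id"), while (predicted-rhs "SL" "$$" demo-table) evaluates to (), the empty right-hand side. A result of #f would correspond to a syntax error.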
A parse function is provided that accepts a grammar and an input string as arguments. It calls the parse-table function and then uses it to parse the input, printing a trace of its actions as it does so, in a manner reminiscent of the –Dparse output from the PL/0 compiler. You can use this function to test your code.
A possible implementation strategy
There are many ways to implement parse-table. Feel free to choose whatever strategy appeals to you. If you’re not sure where to start, there is a skeleton of a few routines that may provide some guidance. These don’t necessarily embody the best strategy.

This code uses two main data structures: a “right context” structure and a “knowledge” structure. A right-context function is provided to generate the former for any given symbol B. The function returns a list of pairs. Each pair consists of a symbol A and a list of symbols β such that for some α, A → α B β. As an example, if you type
(right-context "SL" calc-gram)
the Scheme interpreter should respond
(("P" ("$$")) ("SL" ()))
This tells us that SL appears on the right-hand side of two productions in the grammar: one with P on the left-hand side and one with SL on the left-hand side. In the former, the portion of the right-hand side after the SL is $$. In the latter, the portion of the right-hand side after the SL is empty (that is, SL is the last thing on the right-hand side). In a similar vein, if you type
(right-context "mo" calc-gram)
the Scheme interpreter should respond
(("FT" ("F" "FT")))
This tells us there is only one production with a mo on the right-hand side. It has FT on the left-hand side, and F FT after the mo on the right-hand side.
The right-context information is useful for constructing FOLLOW sets.
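The skeleton already supplies right-context, so you do not need to write it. Purely to make the definition concrete, here is one way such a function could be written in the functional subset. The names sketch-right-context, suffixes-after, and rc-gram are hypothetical, and the provided version may differ in structure and in the order of its results.

```scheme
;; A cut-down grammar in the required format (hypothetical example).
(define rc-gram
  '(("P" ("SL" "$$"))
    ("SL" ("S" "SL") ())
    ("S" ("id") ("read" "id"))))

;; Every suffix of rhs that immediately follows an occurrence of B.
(define suffixes-after
  (lambda (B rhs)
    (cond ((null? rhs) '())
          ((equal? (car rhs) B)
           (cons (cdr rhs) (suffixes-after B (cdr rhs))))
          (else (suffixes-after B (cdr rhs))))))

;; Pairs (A beta) such that A -> alpha B beta is a production.
(define sketch-right-context
  (lambda (B grammar)
    (apply append
           (map (lambda (entry)
                  (let ((A (car entry)))
                    (apply append
                           (map (lambda (rhs)
                                  (map (lambda (beta) (list A beta))
                                       (suffixes-after B rhs)))
                                (cdr entry)))))
                grammar))))
```

On this cut-down grammar, (sketch-right-context "SL" rc-gram) evaluates to (("P" ("$$")) ("SL" ())), the same shape as the example above. Each pair contributes to FOLLOW(B): the terminals that can begin β, plus FOLLOW(A) whenever β can derive ε.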
Assuming you use the suggested strategy, you will need to compute the “knowledge” structure recursively. This structure consists of a list of 4-element sub-lists, one per non-terminal. Each sub-list contains (1) the non-terminal itself (call it A), (2) a Boolean indicating whether we currently think that A →* ε, (3) our current estimate of FIRST(A) − {ε}, and (4) our current estimate of FOLLOW(A) − {ε}. It is much easier to keep track of ε separately, rather than to include it in the FIRST and FOLLOW sets.
The function to generate the knowledge structure is
(define get-knowledge
  (lambda (grammar)
    ;;; your code here; my version is a little under 30 lines
    ))
If you type
(get-knowledge calc-gram)
the interpreter should respond
(("P" #f ("$$" "id" "read" "write") ())
 ("SL" #t ("id" "read" "write") ("$$"))
 ("S" #f ("id" "read" "write") ("$$" "id" "read" "write"))
 ("E" #f ("(" "id" "num") ("$$" ")" "id" "read" "write"))
 ("T" #f ("(" "id" "num") ("$$" ")" "+" "-" "id" "read" "write"))
 ("TT" #t ("+" "-") ("$$" ")" "id" "read" "write"))
 ("FT" #t ("*" "/") ("$$" ")" "+" "-" "id" "read" "write"))
 ("ao" #f ("+" "-") ("(" "id" "num"))
 ("mo" #f ("*" "/") ("(" "id" "num"))
 ("F" #f ("(" "id" "num") ("$$" ")" "*" "+" "-" "/" "id" "read" "write")))
This tells us, for example, that FT generates epsilon, but F does not, and that FOLLOW(mo) = {(, id, num}.
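Since each knowledge entry is a plain 4-element list, its fields can be read with cadr, caddr, and cadddr, and "updated" by building a new list. The sketch below is an illustration under assumptions: the accessor names (knowledge-entry, entry-epsilon?, entry-first, entry-follow) and with-follow are hypothetical, and the skeleton's own helpers, such as symbol-knowledge, may be organized differently.

```scheme
;; Two entries copied from the output above.
(define demo-knowledge
  '(("FT" #t ("*" "/") ("$$" ")" "+" "-" "id" "read" "write"))
    ("mo" #f ("*" "/") ("(" "id" "num"))))

;; Look up the entry for non-terminal A; #f if there is none.
(define knowledge-entry
  (lambda (A knowledge)
    (assoc A knowledge)))

;; Field accessors for one 4-element entry.
(define entry-epsilon? cadr)
(define entry-first caddr)
(define entry-follow cadddr)

;; Functional update: return a new knowledge list in which A's FOLLOW
;; estimate is replaced. The original list is left untouched.
(define with-follow
  (lambda (A new-follow knowledge)
    (map (lambda (entry)
           (if (equal? (car entry) A)
               (list (car entry) (cadr entry) (caddr entry) new-follow)
               entry))
         knowledge)))
```

For example, (entry-follow (knowledge-entry "mo" demo-knowledge)) evaluates to ("(" "id" "num"). Calling with-follow leaves demo-knowledge itself unchanged, which is the property the functional-only rule of this assignment depends on.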
As the base of its recursion, get-knowledge uses an initial, empty structure generated by function initial-knowledge, which is provided. At each step of the recursion the function makes use of utility routines that extract information from the current structure:
(define generates-epsilon?
  (lambda (w knowledge grammar)
    ;;; your code here; my version is 7 lines long
    ))

(define first
  (lambda (w knowledge grammar)
    ;;; your code here; my version is 10 lines long
    ))

(define follow
  (lambda (A knowledge)
    (cadddr (symbol-knowledge A knowledge))))
; This is simpler than the other two functions, because it only needs
; to work for individual non-terminals, not for lists of symbols.
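The first two routines take a list of symbols w rather than a single symbol. One possible reading of what they compute is sketched below. This is not the instructor's version: the sketch- names, union-no-dups, ek-gram, and the hand-written ek-knowledge are all hypothetical, and the sketch makes no attempt to keep sets sorted the way the sample output above appears to be.

```scheme
(define ek-gram
  '(("P" ("E" "$$"))
    ("E" ("T" "TT"))
    ("TT" ("+" "T" "TT") ())
    ("T" ("id"))))

;; Knowledge for ek-gram in the 4-element format, written out by hand.
(define ek-knowledge
  '(("P" #f ("id") ())
    ("E" #f ("id") ("$$"))
    ("TT" #t ("+") ("$$"))
    ("T" #f ("id") ("$$" "+"))))

;; Union of two lists treated as sets; result order is not sorted.
(define union-no-dups
  (lambda (xs ys)
    (cond ((null? xs) ys)
          ((member (car xs) ys) (union-no-dups (cdr xs) ys))
          (else (cons (car xs) (union-no-dups (cdr xs) ys))))))

;; Can every symbol in w derive epsilon, according to knowledge?
;; A terminal (no entry in grammar) never can.
(define sketch-generates-epsilon?
  (lambda (w knowledge grammar)
    (cond ((null? w) #t)
          ((not (assoc (car w) grammar)) #f)
          ((cadr (assoc (car w) knowledge))
           (sketch-generates-epsilon? (cdr w) knowledge grammar))
          (else #f))))

;; FIRST(w) - {epsilon}: walk w left to right, stopping at the first
;; symbol that cannot derive epsilon.
(define sketch-first
  (lambda (w knowledge grammar)
    (cond ((null? w) '())
          ((not (assoc (car w) grammar)) (list (car w)))
          ((cadr (assoc (car w) knowledge))
           (union-no-dups (caddr (assoc (car w) knowledge))
                          (sketch-first (cdr w) knowledge grammar)))
          (else (caddr (assoc (car w) knowledge))))))
```

With these definitions, (sketch-first '("TT" "T") ek-knowledge ek-gram) evaluates to ("+" "id"): TT contributes "+" and, because TT can derive ε, the scan continues into T.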
If you work in pairs on this assignment, one possible division of labor is for one partner to write generates-epsilon?, first, and parse-table, while the other partner writes get-knowledge. A better strategy, however, may be to start by having one partner write generates-epsilon? while the other writes first. Then sit down together and write get-knowledge and parse-table. This is one of those assignments where two heads may work better than one.
Important: you are required to use only the functional features of Scheme; functions with an exclamation point in their names (e.g. set!) and input/output mechanisms other than load and the regular read-eval-print loop are not allowed. (You may find imperative features useful for debugging. That’s ok, but get them out of your code before you hand anything in.)
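An iterative refinement such as the one get-knowledge performs can be written without set! by passing the current estimate as an argument and recursing until it stops changing. The sketch below illustrates the pattern on a simpler problem, finding which non-terminals can derive ε. All names here (fixed-point, nullable-step, every-member?, any-rhs-nullable?, fp-gram) are hypothetical, and this is only one of several ways to structure such a computation.

```scheme
(define fp-gram
  '(("S" ("A" "B" "$$"))
    ("A" ("a") ())
    ("B" ("A" "A") ("b"))))

;; Apply step repeatedly until the state no longer changes.
(define fixed-point
  (lambda (step state)
    (let ((next (step state)))
      (if (equal? next state)
          state
          (fixed-point step next)))))

;; #t when every element of xs is a member of set.
(define every-member?
  (lambda (xs set)
    (or (null? xs)
        (and (member (car xs) set)
             (every-member? (cdr xs) set)))))

;; #t when some right-hand side consists only of nullable symbols.
(define any-rhs-nullable?
  (lambda (rhss nullable)
    (cond ((null? rhss) #f)
          ((every-member? (car rhss) nullable) #t)
          (else (any-rhs-nullable? (cdr rhss) nullable)))))

;; One refinement step: the non-terminals with some right-hand side
;; made up entirely of symbols already known to be nullable.
(define nullable-step
  (lambda (grammar)
    (lambda (nullable)
      (let loop ((entries grammar))
        (cond ((null? entries) '())
              ((any-rhs-nullable? (cdr (car entries)) nullable)
               (cons (car (car entries)) (loop (cdr entries))))
              (else (loop (cdr entries))))))))
```

Starting from the empty estimate, (fixed-point (nullable-step fp-gram) '()) passes through ("A") and settles on ("A" "B"). No variable is ever reassigned; each step builds a fresh list.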
Extra Credit suggestions
1. Modify your parse-table function to print a helpful error message if the input grammar is not LL(1).
2. Modify your parse-table function to print warning messages if there are useless symbols in the grammar: symbols that can’t appear in any valid sentential form (i.e. in any derivation of a string of terminals from the start symbol).
3. Modify the parse function so that it builds and then displays the parse tree.
4. (Hard) Implement syntactic error recovery.
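For the first suggestion, a grammar fails to be LL(1) when two productions for the same non-terminal have overlapping predict sets. A starting point might look like the sketch below, which only detects the conflict and leaves the message itself to you. The names intersect, overlapping?, conflicting-non-terminals, and ll1-demo-table are hypothetical, and the table is a made-up example in the parse-table format, not output from any real grammar in this handout.

```scheme
;; A made-up table in which the two productions for "S" both predict "a".
(define ll1-demo-table
  '(("S" (("a") ("a" "S")) (("a" "b") ("A")))
    ("A" (("b") ("b")) (("$$") ()))))

;; Elements of xs that also occur in ys.
(define intersect
  (lambda (xs ys)
    (cond ((null? xs) '())
          ((member (car xs) ys) (cons (car xs) (intersect (cdr xs) ys)))
          (else (intersect (cdr xs) ys)))))

;; #t when some two predict sets in the list share a token.
(define overlapping?
  (lambda (predict-sets)
    (cond ((null? predict-sets) #f)
          ((pair? (intersect (car predict-sets)
                             (apply append (cdr predict-sets))))
           #t)
          (else (overlapping? (cdr predict-sets))))))

;; Non-terminals whose productions cannot be told apart with one token.
(define conflicting-non-terminals
  (lambda (table)
    (cond ((null? table) '())
          ((overlapping? (map car (cdr (car table))))
           (cons (car (car table))
                 (conflicting-non-terminals (cdr table))))
          (else (conflicting-non-terminals (cdr table))))))
```

Here (conflicting-non-terminals ll1-demo-table) evaluates to ("S"). An empty result would mean that no predict-set conflicts were found.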
Quiz 2
Before the beginning of the next class, finish quiz 2 on Moodle by answering the following questions:
1. What does the following code do? Explain your answer.
(apply * (map + '(1 2 3) '(4 5 6) '(7 8 9)))
2. The sort routine in the skeleton file implements a simple version of the classic quicksort algorithm. Which element does this version use as a pivot (the value around which to partition the list)?
3. When you open a program in DrScheme/Racket, what color does it use to display quoted character strings?