AN IMPLEMENTATION OF A REGULAR EXPRESSION PARSER
SANDESH KANGONDI BASSEM ABUEIN
INTRODUCTION
A regular expression parser processes a regular expression in the following steps:
1. Takes the regular expression as input.
2. Converts the input regular expression to an NFA.
3. Converts the resulting NFA to a DFA.
4. Finally, converts the DFA to a minimal DFA.
Goal: implementation of a regular expression parser.
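The NFA-to-DFA step above can be illustrated with the classic subset construction. The following is a minimal sketch in Python, not the C# code of the presented parser: the dictionary-based NFA layout, the function names, and the hand-built NFA for the pattern a*b are all assumptions made for illustration. DFA minimization is left out.

```python
from collections import deque

EPS = None  # hypothetical marker for epsilon transitions

# Hand-built NFA for the pattern a*b (illustrative layout, not from the slides):
# state -> {symbol: set of next states}
NFA = {
    0: {EPS: {1, 3}},
    1: {"a": {2}},
    2: {EPS: {1, 3}},
    3: {"b": {4}},
}
NFA_START = 0
NFA_ACCEPTING = {4}


def epsilon_closure(nfa, states):
    """All states reachable from `states` using only epsilon transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def move(nfa, states, symbol):
    """States reachable from `states` by consuming one `symbol`."""
    result = set()
    for s in states:
        result |= set(nfa.get(s, {}).get(symbol, ()))
    return result


def nfa_to_dfa(nfa, start, accepting, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = epsilon_closure(nfa, {start})
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue  # already expanded
        dfa[current] = {}
        for sym in alphabet:
            target = epsilon_closure(nfa, move(nfa, current, sym))
            if target:  # an empty target means "no transition"
                dfa[current][sym] = target
                queue.append(target)
    dfa_accepting = {s for s in dfa if s & accepting}
    return dfa, start_set, dfa_accepting


def dfa_accepts(dfa, start, accepting, text):
    """Run the DFA over the whole text; True only on a full match."""
    state = start
    for ch in text:
        state = dfa[state].get(ch)
        if state is None:
            return False
    return state in accepting


dfa, dfa_start, dfa_accepting = nfa_to_dfa(NFA, NFA_START, NFA_ACCEPTING, "ab")
print(len(dfa))                                           # 3
print(dfa_accepts(dfa, dfa_start, dfa_accepting, "aab"))  # True
print(dfa_accepts(dfa, dfa_start, dfa_accepting, "aba"))  # False
```

Each DFA state here is simply the set of NFA states the automaton could be in, which is why the DFA needs no backtracking while scanning the input.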
Specification
Has a GUI for inspecting the states and transitions.
Supports the ^ and $ tokens to match at the beginning and end of the pattern, respectively.
Object-oriented implementation in C#.
Lets you control the greediness of the parser, so you can observe how matching behavior changes with greediness.
E.g.: with greediness set to false, the expression "a_*p" applied to the string "appleandpotato" matches "ap" and not "appleandp".
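The greediness example and the ^ and $ anchors can be reproduced with Python's standard re module, shown here only as an illustration and not as the presented C# parser. It assumes that "_" in the pattern above stands for any single character, which Python writes as "."; Python also selects lazy matching per quantifier with "?" rather than through a global greediness switch.

```python
import re

text = "appleandpotato"

# Greedy: ".*" grabs as much as possible, so the match ends at the last "p".
greedy = re.search(r"a.*p", text)
# Non-greedy: ".*?" grabs as little as possible, so the match ends at the first "p".
lazy = re.search(r"a.*?p", text)

print(greedy.group())  # appleandp
print(lazy.group())    # ap

# "^" anchors the match at the beginning, "$" at the end of the string.
starts_with_apple = bool(re.search(r"^apple", text))
ends_with_potato = bool(re.search(r"potato$", text))
starts_with_potato = bool(re.search(r"^potato", text))

print(starts_with_apple, ends_with_potato, starts_with_potato)  # True True False
```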
FEATURES
APPLICATIONS
Applications of parsing range from simple phrase finding, e.g. for proper name recognition, to full semantic analysis of text, e.g. for information extraction or machine translation.
DESIGN
Class diagram for the parser
The Set class is a simple representation of a mathematical set.
The Map class maps a key to one or more objects.
The State class holds the data structure of the automaton.
The RegEx class is the main class; it uses the other classes.
The RegExValidator class validates a pattern string. Validation is done by recursive descent parsing. Besides validating the pattern, it performs two other tasks: inserting implicit tokens so that they become explicit, and expanding character classes.
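The two preprocessing tasks mentioned above can be sketched as follows. This is a hedged Python illustration, not the RegExValidator code: the function names, the use of "." as the explicit concatenation token, and the rewriting of a class such as [a-c] into (a|b|c) are assumptions, and literals are assumed to be plain alphanumeric characters.

```python
CONCAT = "."  # hypothetical explicit concatenation token


def expand_range(body):
    """Expand the inside of a character class, e.g. 'a-cz' -> ['a', 'b', 'c', 'z']."""
    chars = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # A range such as a-c: add every character from a through c.
            for code in range(ord(body[i]), ord(body[i + 2]) + 1):
                chars.append(chr(code))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


def expand_classes(pattern):
    """Rewrite each [...] class as an explicit alternation group."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            end = pattern.index("]", i)  # raises ValueError if the class is unclosed
            out.append("(" + "|".join(expand_range(pattern[i + 1:end])) + ")")
            i = end + 1
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


def insert_concat(pattern):
    """Make implicit concatenation explicit by inserting CONCAT between operands."""
    out = []
    for i, ch in enumerate(pattern):
        if i > 0:
            prev = pattern[i - 1]
            left_ends_operand = prev not in "(|"    # literal, ')' or '*'
            right_starts_operand = ch not in ")|*"  # literal or '('
            if left_ends_operand and right_starts_operand:
                out.append(CONCAT)
        out.append(ch)
    return "".join(out)


print(insert_concat("ab*(c|d)"))                # a.b*.(c|d)
print(insert_concat(expand_classes("x[a-c]")))  # x.(a|b|c)
```

Making concatenation explicit means later stages can treat it like any other binary operator instead of inferring it from adjacent symbols.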
RECURSIVE DESCENT PARSER
A recursive descent parser is a top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent), where each procedure usually implements one of the production rules of the grammar. As a result, the structure of the resulting program closely mirrors that of the grammar it recognizes.
One easy way to do recursive descent parsing is to have each parse method take the tokens it needs, build a parse tree, and put the parse tree on a global stack:
Write a parse method for each nonterminal in the grammar.
Each parse method should get the tokens it needs, and only those tokens.
Those tokens (usually) go on the stack.
Each parse method may call other parse methods, and expect those methods to leave their results on the stack.
Each (successful) parse method should leave one result on the stack.
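The stack discipline described above can be sketched for a small regular expression grammar. This is an illustrative Python sketch under stated assumptions, not the presented C# parser: the grammar (regex := term ('|' term)*, term := factor+, factor := atom '*'*, atom := literal | '(' regex ')'), the Parser class, and the tuple-shaped tree nodes are all hypothetical.

```python
class Parser:
    """One parse method per nonterminal; each leaves exactly one tree on the stack."""

    def __init__(self, pattern):
        self.tokens = list(pattern)  # one token per character
        self.pos = 0
        self.stack = []              # shared result stack

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_regex(self):
        # regex := term ('|' term)*
        self.parse_term()
        while self.peek() == "|":
            self.take()
            self.parse_term()
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append(("alt", left, right))

    def parse_term(self):
        # term := factor+
        self.parse_factor()
        while self.peek() is not None and self.peek() not in "|)":
            self.parse_factor()
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append(("cat", left, right))

    def parse_factor(self):
        # factor := atom '*'*
        self.parse_atom()
        while self.peek() == "*":
            self.take()
            self.stack.append(("star", self.stack.pop()))

    def parse_atom(self):
        # atom := literal | '(' regex ')'
        tok = self.take()
        if tok == "(":
            self.parse_regex()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
        elif tok is None or tok in "|)*":
            raise SyntaxError(f"unexpected token {tok!r}")
        else:
            self.stack.append(("lit", tok))


def parse(pattern):
    """Parse a pattern and return its tree; raises SyntaxError if it is invalid."""
    parser = Parser(pattern)
    parser.parse_regex()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return parser.stack.pop()


tree = parse("ab*|c")
print(tree)
# ('alt', ('cat', ('lit', 'a'), ('star', ('lit', 'b'))), ('lit', 'c'))
```

Because each method mirrors one production rule, a pattern is valid exactly when parse returns without raising, so the same code can serve as a validator.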
RUNNING AND TESTING
The following slide shows a snapshot of the running program.
The input is the regular expression a_*p.
The output is shown in the snapshot.
Note that the left side of the snapshot lets us search a string for a specific regular expression pattern.