crafting an interpreter part 2 - the calc0 toy language - vrosnet - codeproject

Upload: antony-ingram

Post on 03-Jun-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/12/2019 Crafting an Interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    1/7

    ng an interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    /workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu[19/06/2014 4:44:44 AM]

    ome VROSNET Crafting an interpreter Part 2 - The Calc0 Toy Language

    Workspace home

    Co

    Ta

    Arti

    Add a

    Crafting an interpreter Part 2 - The Calc0Toy Languageby: VROSNETforked from: martinholzherr/ crafting-an-interpreter-part-2-the-calc0-toy-langu

    Second article on building a language interpreter describing error handling and direct evaluation during parsing.(Back to article)

    0

    Share with others

    Congratulations! Your download is complete

    You now have a new workspace in the cloud!

    Your downloaded code is ready to be worked on and shared. You can now access your

    downloaded code from anywhere at any time.

    What can I do with this Workspace?

    Connect directly via Visual Studio

    Make changes to the code without affecting the authors original

    Issue a pull request to the author to alert them of fixes or updates youve made

    View your workspace

    This workspace will provide you a snapshot of

    the download. If the author updates or

    changes the code (uploads zip or push it over

    git) then you can update this snapshot easily

    or keep your version locked to ensure stability.

    Learn More

    Check out our selection of articles to get you

    started!

    Working with ::GitMachine

    Introduction to ::Tasks

    Introduction to git for Windows users

    Migrating a SVN Repository to Git using

    ::GitMachine

    Customizing ::Tasks - tips and tricks

    CREATE AWORKSPACE

    EXPLOREWORKSPACES

    My favorites

    Nothing to display

    My workspaces

    Nothing to display

    Shared with me

    Nothing to display

    WCode WTasks

    Forum

    VROSNET Sign outhome workspaces premium account

    Submit

    https://workspaces.codeproject.com/https://workspaces.codeproject.com/vrosnethttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/codehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/taskshttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/articlehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/storehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/storehttps://workspaces.codeproject.com/vrosnethttps://workspaces.codeproject.com/martinholzherrhttps://workspaces.codeproject.com/martinholzherr/crafting-an-interpreter-part-2-the-calc0-toy-languhttp://www.codeproject.com/Articles/10142/Crafting-an-interpreter-Part-The-Calc-Toy-Languhttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/sharehttp://www.codeproject.com/Articles/738506/Working-with-CodeProject-Workspaces-Code-GitMachinhttp://www.codeproject.com/Articles/746086/An-Introduction-to-Workspaces-Taskshttp://www.codeproject.com/Articles/754569/An-Introduction-to-Workspaces-Git-for-Windows-userhttp://www.codeproject.com/Articles/781740/Migrating-a-Non-standard-SVN-Repository-to-Git-usihttp://www.codeproject.com/Articles/781740/Migrating-a-Non-standard-SVN-Repository-to-Git-usihttp://www.codeproject.com/Messages/4773105/Customising-the-Tasks-application-Tips-and-Tricks.aspxhttps://workspaces.codeproject.com/vrosnet/new_workspacehttps://workspaces.codeproject.com/vrosnet/new_workspacehttps://workspaces.codeproject.com/explorehttps://workspaces.codeproject.com/explorehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/codehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/taskshttp://www.codeproject.com/Forums/1854922/Workspaces-Forum.aspxhttps://workspaces.codeproject.com/user/logouthttps://workspaces.codeproject.com/vrosnethttps://workspaces.codeproject.com/vrosnethttps://workspaces.codeproject.com/user/logouthttps://workspaces.codeproject.com/user/logouthttp://www.codeproject.com/Forums/1854922/Workspaces-Forum.aspxhttp://www.codeproject.com/http://www.codeproject.com/https://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/taskshttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/codehttps://workspaces.codeproject.com/explorehttps://workspaces.codeproject.com/explorehttps://workspaces.codeproject.com/explorehttps://workspaces.codeproject.com/explorehttps://workspaces.codeproject.com/vrosnet/new_workspacehttps://workspaces.codeproject.com/vrosnet/new_workspacehttps://workspaces.codeproject.com/vrosnet/new_workspacehttps://workspaces.codeproject.com/vrosnet/new_workspacehttp://www.codeproject.com/Messages/4773105/Customising-the-Tasks-application-Tips-and-Tricks.aspxhttp://www.codeproject.com/Articles/781740/Migrating-a-Non-standard-SVN-Repository-to-Git-usihttp://www.codeproject.com/Articles/781740/Migrating-a-Non-standard-SVN-Repository-to-Git-usihttp://www.codeproject.com/Articles/754569/An-Introduction-to-Workspaces-Git-for-Windows-userhttp://www.codeproject.com/Articles/746086/An-Introduction-to-Workspaces-Taskshttp://www.codeproject.com/Articles/738506/Working-with-CodeProject-Workspaces-Code-GitMachinhttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/sharehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/sharehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/edithttp://www.codeproject.com/Articles/10142/Crafting-an-interpreter-Part-The-Calc-Toy-Languhttps://workspaces.codeproject.com/martinholzherr/crafting-an-interpreter-part-2-the-calc0-toy-languhttps://workspaces.codeproject.com/martinholzherrhttps://workspaces.codeproject.com/vrosnethttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/storehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/storehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/articlehttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/taskshttps://workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu/codehttps://workspaces.codeproject.com/vrosnethttps://workspaces.codeproject.com/http://www.codeproject.com/
  • 8/12/2019 Crafting an Interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    2/7

    ng an interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    /workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu[19/06/2014 4:44:44 AM]

    Download sample - 107 Kb

    Download source (procedural implementation) - 9.71 KbDownload source (template implementation) - 10.4 Kb

    Introduction

    In this second article on building an interpreter, we discuss the implementation of an interpreter for a very simple

    language. We use a Parsing Expression Grammar [1] [2] to define the syntax and implement the parser. We show

    how evaluation can be done during parsing and how to extend the grammar, so that error messages can be issuedin case of parsing errors. We show furthermore, that a Grammar sometimes must be slightly changed in order to

    alleviate direct evaluation during parsing.

    The Calc0 toy language used for this purpose supports expressions, assignments, variables and print statements.

    Immediately after starting, the interpreter for Calc0 enters an infinite loop consisting of the following processing

    steps:

    It reads a line entered by the user.

    It parses the line, and during parsing, it evaluates the recognized expression, assignment or print statement.

    It updates an internal map of variable names and their associated values.

    It outputs the results of the evaluation.

    http://www.codeproject.com/KB/recipes/crafting_interpreter_p2/crafting_interpreterp2_sample.ziphttp://www.codeproject.com/KB/recipes/crafting_interpreter_p2/crafting_interpreterp2_src1.ziphttp://www.codeproject.com/KB/recipes/crafting_interpreter_p2/crafting_interpreterp2_src2.ziphttp://www.codeproject.com/KB/recipes/crafting_interpreter_p2/crafting_interpreterp2_src2.ziphttp://www.codeproject.com/KB/recipes/crafting_interpreter_p2/crafting_interpreterp2_src1.ziphttp://www.codeproject.com/KB/recipes/crafting_interpreter_p2/crafting_interpreterp2_sample.zip
  • 8/12/2019 Crafting an Interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    3/7

    ng an interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    /workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu[19/06/2014 4:44:44 AM]

    The Calc0 Grammar

    The Calc0 Interpreter expects one line of input. The most basic elements on such a line are spaces, identifiers and

    numbers. We allow only the white space character, tabulators and carriage return as spaces. We choose C/C++

    identifiers and allow floating point or integer values as numbers. Expressions support addition, subtraction,

    multiplication, division and the power operator.

    We need a context free grammar which is compatible with the parsing method used. Since we chose the Parsing

    Expression Grammar formalism, which is an LL parsing method, we must avoid left recursive rules. This results in the

    following Parsing Expression Grammar:

    //first attempt to calc0_grammar

    calc0_grammar: S (print_variables / assignement / expression) S '\0';

    assignement: identifier S '=' S expression ;

    print_variables: '!' S (identifier (S '-' S identifier)?)? ;

    expression: multiplicative_expression

    (S ('+' / '-') S multiplicative_expression)* ;

    multiplicative_expression:

    power_expression (S ('*' / '/') S power_expression)* ;

    power_expression: primary_expression (S '^' S primary_expression)? ;

    primary_expression: ('+' / '-')? S

    (identifier / number / '(' S expression S ')') ;

    number: ([0-9]+ '.'? / '.'[0-9])

    [0-9]* (('e' / 'E') [0-9]+)? ;

    identifier: [A-Za-z_][A-Za-z_0-9]* ;

    S: [ \r\v\t]* ;

    The start rule of this grammar is called calc0_grammar. The PEG operational interpretation of this rule is the

    following: parse the input text by first matching against possible spaces, then try to match the remaining input

    against exactly one of the rules print_variables, assignementor expression(try in this order). If

    successful, continue parsing by skipping possible spaces, and finally expect the input line terminator which is a zero.

    The other rules can be interpreted similarly.

    Direct Evaluation

    Direct evaluation during parsing means that we carry out the arithmetic operations, the variable assignments and

    variable accesses at the place where we parse them. Additions and subtractions, e.g., have to be done inside the

    grammar function responsible for parsing an expression.

    expression: multiplicative_expression

    (S ('+'/ '-') S multiplicative_expression)* ;

    The evaluation of the value of expressionis done by adding/subtracting the result values of the components of

    this rule. Intermediate results are stored in the context structure passed to each grammar rule function. This is shown

    in the following table:

    rule component of expression code executed effect on evaluation

    expression:

    multiplicative_expression

    call multiplicative_expression.

    Then copy the result from the context

    structure to a local variable.

    doubleresult= pS->result;

    evaluation value of first rule

    component is in a local

    variable of the parsing

    function of expression

    (S ('+' / '-') S

    multiplicative_expression)*

    for each loop iteration, we add or

    subtract the value of the

    multiplicative_expressionparsed in this loop to the local variable

    result

    result is accumulated in the

    loop

    The resulting code for actually carrying out the addition inside the parsing loop is shown in the following code

    fragment:

    //code fragment taken from the template implementation of the rule expression

    doubleresult= pS->result;

    for(;;){/*(S ('+' / '-') S multiplicative_expression)*; */

  • 8/12/2019 Crafting an Interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    4/7

    ng an interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    /workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu[19/06/2014 4:44:44 AM]

    S::Matches(p,pS);

    if( And::Matches(p,pS)){

    result+= pS->result;

    }else if( And::Matches(p,pS) ){

    result-= pS->result;

    }else...

    The code for the rule multiplicative_expression, where multiplications and divisions are carried out is

    similar. One major problem in direct evaluation arises for the grammar rule identifier. Depending on the place

    where an identifier is used, we must evaluate completely different things as shown in the following table.

    place of usage of identifier evaluation to be carried out

    identifier S '=' S

    expression ;

    store the name of the identifierin order to be able to store the

    value of expressionlater under this name.

    identifier / number / '('

    S expression S ')'

    store the value of identifieras operand for later to be carried out

    additions, multiplications and assignments.

    '!' S (identifier (S '-' S

    identifier)?)?

    store the names of the identifiers to determine the lexicographical

    beginning and end of the variables to be printed out.

    In the PEG parsing method, there is a one to one correspondence between grammar rule and parsing function. But

    we have to do very different things inside the parsing function for identifierdepending on the caller. One

    solution would be to check for the caller inside the parsing function for identifier. But there is a much simpler

    solution. Use different grammar rules for identifierdepending on the place where it is used. A similar problem

    arises for the rule expression. When used as top element (one of the alternatives for calc0_grammar) we

    should print out the calculated value, but not if the expression is on the left hand side of an assignment or is

    enclosed in parens ('(' expression ')'). We can solve this problem by introducing two equivalent

    grammar_rules, namely expressionand full_expression. In this way, we get the following revised calc0grammar:

    //Revised calc0_grammar

    calc0_grammar: S (full_expression / assignement / print_variables) S '\0';

    assignement: variable S '=' S expression ;

    print_variables: '!' S (identFirst (S '-' S identLast)?)? ;

    expression: multiplicative_expression

    (S ('+' / '-') S multiplicative_expression)* ;

    full_expression: expression ;

    multiplicative_expression:

    power_expression (S ('*' / '/') S power_expression)* ;power_expression: primary_expression (S '^' S primary_expression)? ;

    primary_expression: ('+' / '-')? S

    (identifier / number / '(' S expression S ')') ;

    number: ([0-9]+ '.'? / '.'[0-9])

    [0-9]* (('e' / 'E') [0-9]+)? ;

    identifier: [A-Za-z_][A-Za-z_0-9]* ;

    variable: identifier ;

    identFirst: identifier ;

    identLast: identifier ;

    S: [ \r\v\t]* ;

    Handling of Parsing Errors in PEG's

    An input for a parser is erroneous if an expected element is missing from the input and there is no alternative left.The PEG formalism allows to integrate error recognition into the grammar, as the following table shows:

    Error Situation Error Recognizing Rule

    Eis expected

    S: A B EU ;S: A B (E / Error) U ;

    One of the alternatives B or C must be present

    S: A (B / C )D ;S: A (B / C / Error) D ;

    The simplest kind of error handling in a PEG grammar is to stop the parsing process at the error position, give an

    error message, and throw an exception. We will use this technique here, because the input never exceeds one line.

    We enhance the grammar with indicators for error situations by appending an alternative Fatal.

  • 8/12/2019 Crafting an Interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    5/7

    ng an interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    /workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu[19/06/2014 4:44:44 AM]

    //calc0_grammar with error handling indicators

    calc0_grammar: S

    ( print_variables /

    assignement /

    full_expression /

    Fatal;

    )

    S

    ( '\0' / Fatal) ;

    assignement: variable S '=' S expression ;

    print_variables: '!'

    S

    (

    identFirst

    S

    ( '-' S ( identLast / Fatal) )?

    )? ;

    expression: multiplicative_expression

    (S ('+' / '-') S multiplicative_expression)* ;

    full_expression: expression ;

    multiplicative_expression:

    power_expression (S ('*' / '/') S power_expression)* ;

    power_expression: primary_expression (S '^' S primary_expression)? ;

    primary_expression: ('+' / '-')? S

    ( identifier /

    number /

    '(' S expression S ( ')' / Fatal) /

    Fatal

    ) ;

    number: ([0-9]+ '.'? / '.' ([0-9] / Fatal) )

    [0-9]*

    ( ('e'|'E')

    ( '+' / '-' )?

    ([0-9]+ / Fatal)

    )?;

    identifier: [A-Za-z_][A-Za-z_0-9]* ;

    variable: identifier ;

    identFirst: identifier ;

    identLast: identifier ;

    S: [ \r\v\t]* ;

    In the PEG implementation, there is a one to one relation between the grammar rules given above (inclusive error

    handling) and the template instantiation. This is shown for the start rule for both the template and procedural

    implementation:

    calc0_grammar:

    S (print_variables / assignement / full_expression / Fatal) S ('\0' / Fatal) ;

    template implementation procedural implementation

    typedef

    AndFatal(p0,

    "expression, "

  • 8/12/2019 Crafting an Interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    6/7

    ng an interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    /workspaces.codeproject.com/vrosnet/crafting-an-interpreter-part-2-the-calc0-toy-langu[19/06/2014 4:44:44 AM]

    Or

    > rule;

    "v= expr or '!' expected")

    )

    && S(p,pS)

    && ( Char(p,'\0')

    ||pS->Fatal(p,

    "end of input expected")

    )

    );

    }

    Maintaining Information across RunsThis interpreter runs in an eternal loop of waiting for input, and then reading, parsing and evaluating it. Each time the

    user enters an assignment, it not only calculates the new value of the variable, which is directly involved into the

    assignment, but it also looks for other formulas which use the changed value and recalculates them. This information

    is kept in a map with the following value type:

    typedef std::set set_istring;

    structFormula{

    Formula(std::stringe="",doublev=0.0, constset_istring& sUsed=set_istring())

    :expression(e),value(v),setUsedIdents(sUsed){}

    std::string expression; //needed when recalculation necessary

    double value; //current value of the formula variable

    set_istring setUsedIdents;//identifiers used by the formula

    };

    The code responsible for calculating the new variable value and recalculating the dependent formulas is contained in

    the parser part for the rule assignement: variable S '=' S expression;. The variable, which is the

    target of the assignment, is stored with its value and name, the whole input line, and the variable names which

    appear on the right hand side of the assignment as Formulain the variablesmap.

    Handling of Runtime errors

    Runtime error handling is one of the most underestimated subjects. The two main questions in runtime error

    handling are:

    Amount of runtime error checks.

    Kind of error action: program abortion, raising an exception, setting a flag, or autocorrection.The expectations regarding the amount of runtime checks are different for interpreted and compiled languages and

    differs from language to language. Array bound checks, e.g., are standard in Pascal, but not available for C/C++.

    Even if the language specification defines a runtime error as severe, the programming language culture eventually

    may gently force the implementor to dispense with issuing a runtime error. A good example is integer overflow in

    C/C++, which results in undefined behaviour but will not be caught by most C/C++ implementations.

    The answer to the second question regarding runtime error handling, namely what kind of error action should be

    carried out, is currently biased towards "raising an exception". C++ and Java even define a complete inheritance tree

    of runtime exceptions in their standard libraries. The only notable exception, where throwing an exception may not

    be the right thing, is floating point error handling. IEEE 754 (the standard for floating point arithmetic) supports both,

    setting an error flag and taking a trap. Once a floating point error flag is set, it will stay until it is explicitly cleared.

    The standard stream library recognizes the IEEE 754 error flags and prints them out as in the sample shown below:

    3/0>> = 1.#INF

    1e300*-1e300

    >> = -1.#INF

    When a floating point overflow occurs, it may be meaningful to continue the computation by setting the floating

    point value to the maximum floating point value. This is done by the Calc0 interpreter when assigning to a variable

    in order to avoid illegal variable values in the variables map.

    x= 3/0

    >> x= 1.#INF

    .. floating point overflow (=DBL_MAX)

    >> x= 1.797693134862316e+308

  • 8/12/2019 Crafting an Interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    7/7

    ng an interpreter Part 2 - The Calc0 Toy Language - VROSNET - CodeProject

    vertise| Privacy| Pricing| Report problem| Forum| Twitter Copyright CodeProject, 19

    All Rights Reserved. Term

    layout: fixed| fluid

    The Calc0 interpreter recognizes floating point errors by using the non standard _fpclassfunction which returns

    error indicators like _FPCLASS_NINF(negative overflow), _FPCLASS_PINF(positive overflow), and so on. C99 [2]

    standardizes floating point error handling with additions to the math.hheader, but VC6 and VC7 do not yet support

    it.

    Take-Home Points

    Direct evaluation during parsing can be difficult because of the limited context information available during parsing.

    We can sometimes improve the situation by changing the grammar.

    Parsing errors can be handled easily within the PEG framework, because the components of a PEG grammar rule aresequentially evaluated, which makes execution order predictable.

    Runtime error handling must be adapted to the expectations of the target user group. Standards for runtime error

    handling are rare. An exception is floating point error handling.

    History

    2005 April: initial version.

    References

    1. Parsing Expression Grammars, Bryan Ford, MIT, January 2004[^]2. Parsing Expression Grammars, Wikipedia, the free encyclopedia[^]

    3. C99 Library Reference for [^]

    http://developermedia.com/http://www.codeproject.com/info/privacy.aspxhttps://workspaces.codeproject.com/pricinghttp://www.codeproject.com/Forums/1854922/Workspaces-Forum.aspxhttps://twitter.com/CodeProject_WShttps://twitter.com/CodeProject_WSmailto:[email protected]://www.codeproject.com/info/TermsOfUse.aspxhttp://www.brynosaurus.com/pub/lang/peg-slides/img0.htmlhttp://www.brynosaurus.com/pub/lang/peg-slides/img0.htmlhttp://www.brynosaurus.com/pub/lang/peg-slides/img0.htmlhttp://en.wikipedia.org/wiki/Parsing_expression_grammarhttp://en.wikipedia.org/wiki/Parsing_expression_grammarhttp://www.dinkumware.com/manuals/reader.aspx?b=c/&h=math.htmlhttp://www.dinkumware.com/manuals/reader.aspx?b=c/&h=math.htmlhttp://www.dinkumware.com/manuals/reader.aspx?b=c/&h=math.htmlhttp://www.dinkumware.com/manuals/reader.aspx?b=c/&h=math.htmlhttp://en.wikipedia.org/wiki/Parsing_expression_grammarhttp://en.wikipedia.org/wiki/Parsing_expression_grammarhttp://www.brynosaurus.com/pub/lang/peg-slides/img0.htmlhttp://www.brynosaurus.com/pub/lang/peg-slides/img0.htmlhttp://www.codeproject.com/info/TermsOfUse.aspxmailto:[email protected]://twitter.com/CodeProject_WShttps://twitter.com/CodeProject_WShttp://www.codeproject.com/Forums/1854922/Workspaces-Forum.aspxhttps://workspaces.codeproject.com/pricinghttp://www.codeproject.com/info/privacy.aspxhttp://developermedia.com/