lexeme generator 5th sem 2009 ppt
DESCRIPTION
The first phase of a compiler is called lexical analysis or scanning. The lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes. For each lexeme, the lexical analyzer produces as output a token of the form <token-name, attribute-value> that it passes on to the subsequent phase, syntax analysis.
TRANSCRIPT
Project Guide: Mr. Manmohan Shukla
Project Members: Abhishek Bajpai, Devanshu Gupta, Harshit Srivastava, Kushagra Chawla
Introduction
› Lexeme Generator reads the source program character by character to produce tokens.
› Involves scanning the program to be compiled and recognizing the tokens making up the source statements.
› Designed to recognize keywords, operators, identifiers, constants, character strings, etc.
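The character-by-character scan described above can be sketched in C as follows. This is a minimal illustration, not the project's actual code: the names `TokenKind`, `Token` and `next_token`, the 31-character lexeme limit and the operator set are all assumptions; the keyword list is borrowed from the "Input Output" slide.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes; the slides do not name the project's own. */
typedef enum {
    TOK_EOF, TOK_KEYWORD, TOK_IDENT, TOK_NUMBER, TOK_OPERATOR, TOK_UNKNOWN
} TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* copy of the lexeme, truncated beyond 31 characters */
} Token;

/* Keywords taken from the "Input Output" slide. */
static const char *const KEYWORDS[] = { "begin", "end", "if", "while" };

static int is_keyword(const char *s)
{
    for (size_t i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++)
        if (strcmp(s, KEYWORDS[i]) == 0)
            return 1;
    return 0;
}

/* Scan one token starting at *src and advance *src past it. */
Token next_token(const char **src)
{
    assert(src != NULL && *src != NULL);
    const char *p = *src;
    Token tok = { TOK_EOF, "" };

    while (isspace((unsigned char)*p))          /* skip white space */
        p++;

    const char *start = p;
    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        tok.kind = TOK_IDENT;                   /* may become a keyword below */
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        if (*p == '.' && isdigit((unsigned char)p[1])) {   /* fractional part */
            p++;
            while (isdigit((unsigned char)*p))
                p++;
        }
        tok.kind = TOK_NUMBER;
    } else if (strchr("+-*/=^", *p) != NULL) {  /* assumed one-character operators */
        p++;
        tok.kind = TOK_OPERATOR;
    } else {
        p++;
        tok.kind = TOK_UNKNOWN;                 /* a candidate for error reporting */
    }

    size_t len = (size_t)(p - start);
    if (len >= sizeof tok.text)
        len = sizeof tok.text - 1;
    memcpy(tok.text, start, len);
    tok.text[len] = '\0';

    if (tok.kind == TOK_IDENT && is_keyword(tok.text))
        tok.kind = TOK_KEYWORD;

    *src = p;
    return tok;
}
```

A caller keeps a `const char *` cursor into the source text and calls `next_token(&cursor)` repeatedly until it returns `TOK_EOF`. String and character literals are left out to keep the sketch short.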
Role of Lexeme Generator
› First phase of translation
› Recognizes tokens and ignores white spaces & comments
› Generates token stream
› Error reporting
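As one possible illustration of ignoring white space and comments while still reporting an error, the sketch below skips blanks plus C-style `//` and `/* */` comments. The comment syntax is an assumption, since the slides do not say which source language is scanned, and the function name `skip_blanks_and_comments` is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns a pointer to the first character that is neither white space nor
 * inside a comment. Sets *unterminated to 1 (and stops at the end of input)
 * if a block comment is never closed, so the caller can report the error. */
const char *skip_blanks_and_comments(const char *p, int *unterminated)
{
    assert(p != NULL && unterminated != NULL);
    *unterminated = 0;
    for (;;) {
        if (isspace((unsigned char)*p)) {
            p++;
        } else if (p[0] == '/' && p[1] == '/') {        /* line comment */
            while (*p != '\0' && *p != '\n')
                p++;
        } else if (p[0] == '/' && p[1] == '*') {        /* block comment */
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p == '\0') {
                *unterminated = 1;
                return p;
            }
            p += 2;                                     /* step over the closing marker */
        } else {
            return p;                                   /* start of the next lexeme */
        }
    }
}
```

Calling this before each token is scanned keeps comments and blanks out of the token stream.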
Terminology
› Lexemes, Tokens & Patterns
› Lexemes: the smallest logical units of the program.
Ex. sequences of characters such as 10.0, Roll, int, ...
› Tokens: classes of similar lexemes; each class is identified as a token.
Ex. Identifier, Keyword, Constant
› Patterns: a pattern is a rule which describes a token.
› Lexeme is matched against pattern to generate token.
Attribute for Tokens
› Lexeme Generator has to provide additional information when more than one lexeme matches a pattern, so that later phases know which lexeme was actually found.
› e.g. E = M * C ^ 2
<ID, pointer to symbol-table entry for E>
<assign-op> (no attribute needed)
<ID, pointer to symbol-table entry for M>
<mult-op>
<ID, pointer to symbol-table entry for C>
<exp-op>
<NUM, integer value 2>
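One way to represent such token/attribute pairs in C is a tagged struct whose attribute is either a symbol-table pointer or an integer value. The sketch below is illustrative only: `SymbolTable`, `install_id`, `make_id`, `make_num`, `make_op` and the fixed table sizes are assumed names and limits, not taken from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* assumed capacity */
#define MAX_NAME    32   /* assumed maximum identifier length, including '\0' */

typedef struct { char name[MAX_NAME]; } Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

typedef enum { T_ID, T_NUM, T_ASSIGN_OP, T_MULT_OP, T_EXP_OP } TokenName;

typedef struct {
    TokenName name;
    union {
        const Symbol *entry;   /* used by T_ID */
        int value;             /* used by T_NUM */
    } attr;                    /* operators need no attribute */
} Token;

/* Return the entry for `name`, adding it if absent; NULL if it does not fit. */
const Symbol *install_id(SymbolTable *tab, const char *name)
{
    assert(tab != NULL && name != NULL);
    for (size_t i = 0; i < tab->count; i++)
        if (strcmp(tab->entries[i].name, name) == 0)
            return &tab->entries[i];
    if (tab->count == MAX_SYMBOLS || strlen(name) >= MAX_NAME)
        return NULL;
    strcpy(tab->entries[tab->count].name, name);
    return &tab->entries[tab->count++];
}

/* <ID, pointer to symbol-table entry for lexeme> */
Token make_id(SymbolTable *tab, const char *lexeme)
{
    Token t;
    t.name = T_ID;
    t.attr.entry = install_id(tab, lexeme);
    return t;
}

/* <NUM, integer value> */
Token make_num(int value)
{
    Token t;
    t.name = T_NUM;
    t.attr.value = value;
    return t;
}

/* Operator tokens carry no attribute; the pointer is cleared for tidiness. */
Token make_op(TokenName name)
{
    Token t;
    t.name = name;
    t.attr.entry = NULL;
    return t;
}
```

With this layout, the statement above would become `make_id(&tab, "E")`, `make_op(T_ASSIGN_OP)`, `make_id(&tab, "M")`, `make_op(T_MULT_OP)`, `make_id(&tab, "C")`, `make_op(T_EXP_OP)`, `make_num(2)`. A repeated identifier maps to the same table entry.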
Block Diagram
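[Figure: Source Program → Lexeme Generator → (Token) → Parser → To Semantic Analysis. The Parser sends "Get Next Token" requests back to the Lexeme Generator, and both use the Symbol Table.]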
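The "Get Next Token" arrow suggests that the parser pulls tokens on demand instead of receiving the whole stream up front. A minimal sketch of that interface follows, under loud simplifications: lexemes are delimited by white space only, the "parser" merely counts tokens, and the names `Lexer`, `get_next_token` and `parse` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct { const char *cursor; } Lexer;       /* scanner state */

typedef struct {
    const char *start;   /* first character of the lexeme */
    size_t length;       /* 0 means end of input */
} Token;

/* Called by the parser each time it needs one more token. */
Token get_next_token(Lexer *lx)
{
    assert(lx != NULL && lx->cursor != NULL);
    while (isspace((unsigned char)*lx->cursor))
        lx->cursor++;
    Token tok = { lx->cursor, 0 };
    while (*lx->cursor != '\0' && !isspace((unsigned char)*lx->cursor)) {
        lx->cursor++;
        tok.length++;
    }
    return tok;
}

/* Stand-in for the parser: requests tokens one at a time until input ends
 * and returns how many it consumed. */
size_t parse(Lexer *lx)
{
    size_t consumed = 0;
    while (get_next_token(lx).length > 0)
        consumed++;
    return consumed;
}
```

Because the scanner keeps its position in `Lexer`, it does no work until the parser asks, which matches the request/response loop in the diagram.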
Input Output
› Input: a sequence of characters
› Output: a series of tokens:
• Punctuation: ( ) ; , [ ]
• Operators: + - * :=
• Keywords: begin end if while
• Identifiers: SquareRoot
• String literals: “press Enter”
• Character literals: ‘x’
• Numeric literals
- Integer: 123
- Floating point: 45.23
Performance Issues
› Speed
• Lexical analysis can become bottleneck
• Minimize processing time per character
- Skip blanks fast
- I/O is also an issue (read large blocks)
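Both points can be illustrated together: read the input in large blocks with `fread`, and classify blanks with a table lookup so the per-character cost is one array access. This is only a sketch; the 4096-byte block size, the table `IS_BLANK` and the function `count_non_blank` are assumptions chosen for illustration, and a real scanner would hand the non-blank characters to its token recognizer where this one only counts them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define BLOCK_SIZE 4096   /* assumed block size */

/* 1 for characters treated as blanks, 0 for everything else. */
static const unsigned char IS_BLANK[256] = {
    [' '] = 1, ['\t'] = 1, ['\n'] = 1, ['\r'] = 1
};

/* Read the stream block by block and count its non-blank characters. */
size_t count_non_blank(FILE *fp)
{
    assert(fp != NULL);
    unsigned char block[BLOCK_SIZE];
    size_t total = 0;
    size_t got;

    while ((got = fread(block, 1, sizeof block, fp)) > 0) {
        for (size_t i = 0; i < got; i++)
            total += !IS_BLANK[block[i]];   /* one table lookup per character */
    }
    return total;
}
```

Reading a block at a time replaces one library call per character with one call per 4096 characters.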
Design Constraints
• Implemented in ‘C’ Language
• No Database Connectivity
• Operating System : Windows XP
SDLC Model
› Iterative waterfall model
Level 0 DFD
Level 1 DFD
Any Questions?