lexeme generator 5th sem 2009 ppt

Post on 05-Apr-2015


DESCRIPTION

The first phase of a compiler is called lexical analysis or scanning. The lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes. For each lexeme, the lexical analyzer produces as output a token of the form <token-name, attribute-value> that it passes on to the subsequent phase, syntax analysis.

TRANSCRIPT

Project Guide: Mr. Manmohan Shukla

Project Members: Abhishek Bajpai, Devanshu Gupta, Harshit Srivastava, Kushagra Chawla

Introduction

› Lexeme Generator reads the source program character by character to produce tokens.

› Involves scanning the program to be compiled and recognizing the tokens making up the source statements.

› Designed to recognize keywords, operators, identifiers, constants, character strings, etc.


Role of Lexeme Generator

› First phase of translation

› Recognizes tokens and ignores white spaces & comments

› Generates token stream

› Error reporting
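The "ignores white spaces & comments" and "error reporting" roles can be sketched together in C. This is only an illustration: the deck does not say which comment syntax the scanned language uses, so the `/* ... */` form is an assumption, and the function name `skip_blanks_and_comments` is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper (not from the slides): returns a pointer to the
 * first character that is neither white space nor inside a comment.
 * Assumes comments are written as slash-star ... star-slash.
 * *unterminated is set to 1 when a comment is opened but never closed,
 * which is the kind of error a scanner would report. */
const char *skip_blanks_and_comments(const char *p, int *unterminated)
{
    *unterminated = 0;
    for (;;) {
        /* Skip any run of white space. */
        while (isspace((unsigned char)*p))
            p++;

        if (p[0] == '/' && p[1] == '*') {
            p += 2;                          /* step over the opener */
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p == '\0') {                /* ran off the end of input */
                *unterminated = 1;
                return p;
            }
            p += 2;                          /* step over the closer */
        } else {
            return p;                        /* start of the next lexeme */
        }
    }
}
```

A scanner would call this before recognizing each lexeme, and report an error whenever `*unterminated` comes back as 1.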


Terminology

› Lexemes, Tokens & Patterns

Lexemes: The smallest logical unit of the program. Ex. sequences of characters such as 10.0, Roll, int, ...

Tokens: Classes of similar lexemes are identified as a token. Ex. Identifier, Keywords, Constants

Patterns: A pattern is a rule which describes the token.

› Lexeme is matched against pattern to generate token.
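Matching a lexeme against patterns can be illustrated with a small C sketch. The patterns below (a fixed keyword list, letter-then-alphanumerics for identifiers, digits with an optional fraction for constants) and all names such as `classify` and `TokenClass` are assumptions chosen for the example, not the project's actual rules.

```c
#include <ctype.h>
#include <string.h>

typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_CONSTANT, TOK_UNKNOWN } TokenClass;

/* Pattern: the lexeme equals one of a fixed set of reserved words
 * (an assumed, deliberately short list). */
static int is_keyword(const char *s)
{
    static const char *const keywords[] = { "int", "float", "if", "while" };
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0)
            return 1;
    return 0;
}

/* Pattern: a letter or '_' followed by letters, digits or '_'. */
static int is_identifier(const char *s)
{
    if (!isalpha((unsigned char)*s) && *s != '_')
        return 0;
    for (s++; *s != '\0'; s++)
        if (!isalnum((unsigned char)*s) && *s != '_')
            return 0;
    return 1;
}

/* Pattern: one or more digits, optionally '.' and one or more digits. */
static int is_constant(const char *s)
{
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        s++;
    if (*s == '.') {
        s++;
        if (!isdigit((unsigned char)*s))
            return 0;
        while (isdigit((unsigned char)*s))
            s++;
    }
    return *s == '\0';
}

/* Keywords are tested first because every keyword also fits the
 * identifier pattern. */
TokenClass classify(const char *lexeme)
{
    if (is_keyword(lexeme))    return TOK_KEYWORD;
    if (is_identifier(lexeme)) return TOK_IDENTIFIER;
    if (is_constant(lexeme))   return TOK_CONSTANT;
    return TOK_UNKNOWN;
}
```

With the slide's examples, `classify("int")` yields `TOK_KEYWORD`, `classify("Roll")` yields `TOK_IDENTIFIER`, and `classify("10.0")` yields `TOK_CONSTANT`.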


Attribute for Tokens

› The Lexeme Generator has to provide additional information (an attribute) when more than one lexeme matches a pattern.

› e.g. E = M * C ^ 2

<ID, pointer to symbol-table entry for E>

<assign-op> (no attribute needed)

<ID, pointer to symbol-table entry for M>

<mult-op>

<ID, pointer to symbol-table entry for C>

<exp-op>

<NUM, integer value 2>
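The token stream for E = M * C ^ 2 can be reproduced with a short C sketch. Everything here is illustrative: the `Token` layout, the fixed-size `symtab` array, and the use of an array index in place of the "pointer to symbol-table entry" are assumptions made to keep the example small.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef enum { ID, NUM, ASSIGN_OP, MULT_OP, EXP_OP, END } TokenName;

typedef struct {
    TokenName name;
    int attr;   /* ID: symbol-table index; NUM: integer value; else unused (0) */
} Token;

#define MAX_SYMBOLS 32
#define MAX_NAME    32
static char symtab[MAX_SYMBOLS][MAX_NAME];   /* toy symbol table (assumed) */
static int  symcount = 0;

/* Returns the index of the name in the symbol table, inserting it if it
 * is new. Returns -1 when the table is full. Over-long names are cut. */
static int sym_lookup(const char *name, size_t len)
{
    int i;
    if (len >= MAX_NAME)
        len = MAX_NAME - 1;
    for (i = 0; i < symcount; i++)
        if (strlen(symtab[i]) == len && strncmp(symtab[i], name, len) == 0)
            return i;
    if (symcount == MAX_SYMBOLS)
        return -1;
    memcpy(symtab[symcount], name, len);
    symtab[symcount][len] = '\0';
    return symcount++;
}

/* Scans src into out[], which must hold at least max >= 1 tokens.
 * The last token written is always END. Returns the number of tokens
 * written, or -1 on a character that matches no pattern. */
int tokenize(const char *src, Token *out, int max)
{
    int n = 0;
    while (n < max - 1) {
        while (isspace((unsigned char)*src))
            src++;
        if (*src == '\0')
            break;
        if (isalpha((unsigned char)*src)) {
            const char *start = src;
            while (isalnum((unsigned char)*src))
                src++;
            out[n].name = ID;
            out[n].attr = sym_lookup(start, (size_t)(src - start));
        } else if (isdigit((unsigned char)*src)) {
            char *end;
            out[n].name = NUM;
            out[n].attr = (int)strtol(src, &end, 10);
            src = end;
        } else {
            switch (*src++) {
            case '=': out[n].name = ASSIGN_OP; break;
            case '*': out[n].name = MULT_OP;   break;
            case '^': out[n].name = EXP_OP;    break;
            default:  return -1;               /* unrecognized character */
            }
            out[n].attr = 0;                   /* operators need no attribute */
        }
        n++;
    }
    out[n].name = END;
    out[n].attr = 0;
    return n + 1;
}
```

Calling `tokenize("E = M * C ^ 2", tokens, 16)` on a fresh table fills `tokens` with ID, ASSIGN_OP, ID, MULT_OP, ID, EXP_OP, NUM and END, where the three ID tokens carry different symbol-table indices and the NUM token carries the value 2.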


Block Diagram

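[Figure: block diagram. Source Program → Lexeme Generator → (Token) → Parser → To Semantic Analysis. The Parser sends "Get Next Token" requests to the Lexeme Generator. Labelled components: Source Program, Lexeme Generator, Parser, Symbol Table.]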
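The "Get Next Token" interaction in the diagram can be sketched as a pull interface in C: the parser asks for one token at a time and the scanner keeps its position between calls. The `Scanner` and `Tok` types, the three coarse token kinds, and `count_tokens` (a stand-in for the parser) are all hypothetical names introduced for this illustration.

```c
#include <ctype.h>
#include <string.h>

typedef enum { T_IDENT, T_NUMBER, T_SYMBOL, T_EOF } TokKind;

typedef struct {
    TokKind kind;
    char lexeme[64];    /* copy of the matched characters (truncated if longer) */
} Tok;

typedef struct {
    const char *pos;    /* next unread character of the source program */
} Scanner;

void scanner_init(Scanner *s, const char *source)
{
    s->pos = source;
}

/* Called by the parser each time it needs one more token. */
Tok get_next_token(Scanner *s)
{
    Tok t;
    const char *start;
    size_t len;

    while (isspace((unsigned char)*s->pos))     /* white space is not a token */
        s->pos++;
    start = s->pos;

    if (*start == '\0') {
        t.kind = T_EOF;                          /* position stays at the end */
    } else if (isalpha((unsigned char)*start)) {
        while (isalnum((unsigned char)*s->pos))
            s->pos++;
        t.kind = T_IDENT;
    } else if (isdigit((unsigned char)*start)) {
        while (isdigit((unsigned char)*s->pos))
            s->pos++;
        t.kind = T_NUMBER;
    } else {
        s->pos++;                                /* any other single character */
        t.kind = T_SYMBOL;
    }

    len = (size_t)(s->pos - start);
    if (len >= sizeof t.lexeme)
        len = sizeof t.lexeme - 1;
    memcpy(t.lexeme, start, len);
    t.lexeme[len] = '\0';
    return t;
}

/* Stand-in for the parser: pulls tokens until end of input and
 * returns how many it received. */
int count_tokens(const char *source)
{
    Scanner s;
    int n = 0;
    scanner_init(&s, source);
    while (get_next_token(&s).kind != T_EOF)
        n++;
    return n;
}
```

Because the scanner only advances when asked, the whole token stream never has to be held in memory at once; the parser consumes each token as it arrives.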


Input Output

› Input: Sequence of characters

› Output: A series of tokens:

Punctuation: ( ) ; , [ ]

Operators: + - * :=

Keywords: begin end if while

Identifiers: SquareRoot

String literals: “press Enter”

Character literals: ‘x’

Numeric literals:

- Integer: 123

- Floating point: 45.23


Performance Issues

› Speed

• Lexical analysis can become a bottleneck

• Minimize processing time per character

- Skip blanks fast

- I/O is also an issue (read large blocks)
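The "read large blocks" advice can be sketched in C with `fread`: the file is read one block at a time and the per-character loop runs over memory instead of calling an I/O routine for every character. The 4096-byte block size and the function names `scan_block` and `scan_file` are assumptions; the per-character work here is just counting non-blank characters, standing in for real token recognition.

```c
#include <stdio.h>

#define BLOCK_SIZE 4096   /* assumed block size; tune for the target system */

/* Per-character work on an in-memory block. Blanks are skipped with
 * plain comparisons, keeping the cost per character small. */
size_t scan_block(const char *buf, size_t n)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++) {
        char c = buf[i];
        if (c != ' ' && c != '\t' && c != '\n' && c != '\r')
            count++;
    }
    return count;
}

/* Reads the file in large blocks rather than one character at a time
 * and returns the total number of non-blank characters seen. */
size_t scan_file(FILE *fp)
{
    char buf[BLOCK_SIZE];
    size_t n, total = 0;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += scan_block(buf, n);
    return total;
}
```

A real scanner also has to cope with a lexeme that straddles two blocks, for example by keeping two buffers; this sketch avoids the problem only because it counts characters and never assembles lexemes.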


Design Constraints

• Implemented in ‘C’ Language

• No Database Connectivity

• Operating System: Windows XP


SDLC Model

› Iterative waterfall model


Level 0 DFD


Level 1 DFD


Any Questions?
