lex(1) and flex(1)
Lex public interface
• FILE *yyin; /* set before calling yylex() */
• int yylex(); /* call once per token */
• char yytext[]; /* chars matched by yylex() */
• int yywrap(); /* end-of-file handler */
.l file format
header
%%
body
%%
helper functions
Lex header
• C code inside %{ … %}
  – prototypes for helper functions
  – #include's that #define integer token categories
• Macro definitions, e.g.
letter [a-zA-Z]
digit [0-9]
ident {letter}({letter}|{digit})*
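• Warning: macros are fraught with peril. Classic lex substitutes a macro's text as-is, without adding parentheses, so an operator applied to {name} may bind to only part of the definition. Parenthesize definitions that contain alternation or concatenation.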
Lex body
• Regular expressions with semantic actions
" " { /* discard */ }
{ident} { return IDENT; }
"*" { return ASTERISK; }
"." { return PERIOD; }
• Match the longest r.e. possible
• Break ties with whichever appears first
• If it fails to match: copy unmatched to stdout
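The "longest match, then first rule" policy can be sketched in plain C. This is only an illustration of the selection rule, not how lex works internally (lex compiles all rules into a single automaton). The keyword rule "if", the enum values, and the names match_literal, match_ident, and pick_rule are assumptions made up for this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token categories; names and values are illustrative only. */
enum { NONE = 0, IF = 1, IDENT = 2, ASTERISK = 3, PERIOD = 4 };

/* Each matcher returns how many leading chars of s its rule
   matches (0 means the rule does not match here). */
static size_t match_literal(const char *s, const char *lit) {
    size_t n = strlen(lit);
    return strncmp(s, lit, n) == 0 ? n : 0;
}

/* {letter}({letter}|{digit})* */
static size_t match_ident(const char *s) {
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

/* Try every rule in listing order and keep the longest match.
   The strict '>' means an earlier rule wins a tie. */
int pick_rule(const char *s, size_t *len) {
    static const int cats[4] = { IF, IDENT, ASTERISK, PERIOD };
    size_t lens[4];
    int best = NONE;

    assert(s != NULL && len != NULL);
    lens[0] = match_literal(s, "if");   /* hypothetical keyword rule */
    lens[1] = match_ident(s);
    lens[2] = match_literal(s, "*");
    lens[3] = match_literal(s, ".");

    *len = 0;
    for (int i = 0; i < 4; i++) {
        if (lens[i] > *len) { *len = lens[i]; best = cats[i]; }
    }
    return best;
}
```

With this sketch, the input "if(" ties at two characters between the keyword rule and the identifier rule, and the keyword wins because it is listed first; the input "iffy" goes to the identifier rule because that match is longer.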
Lex helper functions
• Follows rules of ordinary C code
• Compute lexical attributes
• Do stuff the regular expressions can't do
• Write a yywrap() to switch files on EOF
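A yywrap() that switches files might look roughly like the sketch below. By convention, returning 0 tells yylex() that yyin has been pointed at more input, and returning 1 means there is nothing left. The pending_files queue and set_pending_files() are hypothetical helpers invented for this example. In a real scanner, yyin is defined by the generated code, so the definition here would be dropped.

```c
#include <assert.h>
#include <stdio.h>

/* In a generated scanner, lex defines yyin; it is defined here
   only so that this sketch compiles on its own. */
FILE *yyin;

/* Hypothetical queue of input files still to be scanned:
   a NULL-terminated array of names, or NULL if never set. */
static const char **pending_files;

void set_pending_files(const char **names) {
    assert(names != NULL);
    pending_files = names;
}

/* Called by yylex() at end-of-file.
   Returns 0 if yyin now points at more input (keep scanning),
   or 1 if there is no more input. */
int yywrap(void) {
    if (yyin != NULL && yyin != stdin) {
        fclose(yyin);
        yyin = NULL;
    }
    while (pending_files != NULL && *pending_files != NULL) {
        const char *name = *pending_files++;
        yyin = fopen(name, "r");
        if (yyin != NULL) return 0;   /* opened the next file */
        perror(name);                 /* skip unreadable files */
    }
    return 1;                         /* nothing left to scan */
}
```

A main() would typically call set_pending_files() with the remaining command-line arguments, open the first file into yyin, and then call yylex() in a loop.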
Lex regular expressions
• \c escapes for most operators
• "s" match C string as-is (superescape)
• r{m,n} match r between m and n times
• r/s match r when s follows
• ^r match r when at beginning of line
• r$ match r when at end of line
struct token
struct token {
    int category;
    char *text;
    int linenumber;
    int column;
    char *filename;
    union literal value;
};
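One way a helper function could fill in such a struct is sketched below. The members of union literal, and the names copy_string, make_token, and free_token, are assumptions for illustration; the slide defines none of them. The text is copied because yytext is overwritten by the next match.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical members; the slide does not define union literal. */
union literal { long ival; double dval; char *sval; };

struct token {
    int category;
    char *text;
    int linenumber;
    int column;
    char *filename;
    union literal value;
};

/* Copy s into fresh heap storage; returns NULL on allocation failure. */
static char *copy_string(const char *s) {
    char *p = malloc(strlen(s) + 1);
    if (p != NULL) strcpy(p, s);
    return p;
}

/* Release a token and the strings it owns. */
void free_token(struct token *t) {
    if (t == NULL) return;
    free(t->text);
    free(t->filename);
    free(t);
}

/* Build a token from the matched text and its source position.
   Returns NULL on allocation failure. */
struct token *make_token(int category, const char *text,
                         int line, int column, const char *filename) {
    struct token *t;
    assert(text != NULL && filename != NULL);
    t = calloc(1, sizeof *t);
    if (t == NULL) return NULL;
    t->category = category;
    t->linenumber = line;
    t->column = column;
    t->text = copy_string(text);
    t->filename = copy_string(filename);
    if (t->text == NULL || t->filename == NULL) {
        free_token(t);
        return NULL;
    }
    return t;
}
```

In a scanner action this might be called as make_token(IDENT, yytext, lineno, col, fname), where lineno, col, and fname are counters the helper code would have to maintain itself.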
“string removal tool”
%%
"zap me"
whitespace trimmer
%%
[ \t]+$ /* drop entirely */
[ \t]+ putchar(' ');
string replacement
%%
username printf("%s", getlogin());
Line/character counter
%{
#include <stdio.h>
int lines = 0, chars = 0;
%}
%%
\n  ++lines; ++chars;
.   ++chars;
%%
int main(void) { yylex(); printf("lines: %d chars: %d\n", lines, chars); return 0; }
Example: C reals
• Is it: [0-9]*.[0-9]*
• Is it: ([0-9]+.[0-9]* | [0-9]*.[0-9]+)
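The two candidates can be tried out without lex by using the POSIX regex API (regcomp/regexec from <regex.h>), as in the rough sketch below. The helper full_match and the constant names are invented for this example. The dot is escaped so that it means a literal period, and the spaces around | are dropped. Even the second candidate is not a complete definition of a C real, since it ignores exponents and suffixes.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Return 1 if the whole of text matches the POSIX extended
   regular expression pattern, 0 otherwise (or on error). */
int full_match(const char *pattern, const char *text) {
    char anchored[256];
    regex_t re;
    int n, ok;

    assert(pattern != NULL && text != NULL);
    /* Anchor at both ends so partial matches do not count. */
    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored) return 0;
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0) return 0;
    ok = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}

/* The slide's first candidate exactly as written: the bare dot
   matches any character, not just a period. */
const char *const UNESCAPED_1 = "[0-9]*.[0-9]*";

/* Both candidates with the dot escaped. */
const char *const CANDIDATE_1 = "[0-9]*\\.[0-9]*";
const char *const CANDIDATE_2 = "[0-9]+\\.[0-9]*|[0-9]*\\.[0-9]+";
```

Under these assumptions the first candidate accepts a lone "." because both digit runs may be empty, while the second requires at least one digit on one side of the period. The unescaped form also accepts strings such as "1x2".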