doctor syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · doctor...
TRANSCRIPT
![Page 1: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/1.jpg)
![Page 2: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/2.jpg)
Doctor Syntax was the fictitious hero of several 19th century satiricalpoems by William Combe (1742–1823). The publications wereillustrated by Thomas Rowlandson (1756–1827).
Rowlandson’s many comic illustrations offered humorouscommentary on the political and social conventions of hisday. Rowlandson is perhaps best known for his animateddrawings of the mis-adventures of the infamous Dr. Syntax.The various escapades of the fictional English clergymanand schoolmaster, Dr. Syntax, are chronicled in threevolumes or “Tours.” The Tours of Dr. Syntax were satires ona then contemporary series of picturesque adventures tovarious parts of England with an emphasis on landscapeillustration and poetic verse.
I The tour of Doctor Syntax in search of the picturesque
I The second tour of Doctor Syntax in search of consolation
I The third tour of Doctor Syntax in search of a wife
![Page 3: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/3.jpg)
Overview of Syntax
I Study of Language — page ??I Structure — page ??I Formal Languages — page ??I Regular Expressions — page ??I BNF, ∗ Context Free Grammar, parse trees — page ??I ∗Attribute Grammars, ∗Post Systems
![Page 4: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/4.jpg)
Semiotics
Semiotics. Date: 1880. A general philosophical theory of signs andsymbols that deals especially with their function in both artificiallyconstructed and natural languages. The application of linguisticmethods to other than natural languages. A philosophy thatemphasizes the significance of language as universal.In his 1946 book, Signs, Language, and Behavior, Americanphilosopher Charles William Morris (1901–1979) gave thesedefinitions for the branches of the science of signs
I pragmatics — origin, uses and effects of signs with the behaviorin which they occur;
I semantics — the signification of signs in all modes of signifying;
I syntax — combination of signs without regard for their specificsignifications or their relations to the behavior in which they occur.
![Page 5: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/5.jpg)
Study of Language
I syntax is the form, structure of language
I semantics is the meaning of language
I pragmatics is origin, use, context of language
Distinction between semantics and syntax:
1. number versus numeral
2. character versus glyph
3. conjunction versus ∧
![Page 6: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/6.jpg)
Syntax in NL
’Twas brillig, and the slithy tovesDid gyre and gimble in the wabe;
All mimsy were the borogoves,And the mome raths outgrabe.
Lewis Carroll, “Jabberwocky”
We know certain things—for instance, that all this happened in thepast and that there was more than one tove but only one wabe. Weknow these things because of the grammatical structure of the Englishlanguage.
![Page 7: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/7.jpg)
Syntax
Can you translate the poem “Jabberwocky” into other languages?
’Twas BRKPT and the I/O queueWas SYMMING FASTRAND like the wind.
All idle was the CAUAs the last run had just FINNED.
“Beware the UNIVAC, my son,Its FASTRAND and its high-speed drum,
And FIELDATA, and listen forThe CTMC’s hum.”
Harry Gilbert, mid 1970s
![Page 8: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/8.jpg)
Syntax
Can you translate the poem “Jabberwocky” into other languages?
’Twas BRKPT and the I/O queueWas SYMMING FASTRAND like the wind.
All idle was the CAUAs the last run had just FINNED.
“Beware the UNIVAC, my son,Its FASTRAND and its high-speed drum,
And FIELDATA, and listen forThe CTMC’s hum.”
Harry Gilbert, mid 1970s
![Page 9: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/9.jpg)
Arpawocky
Also, Arpawocky, RFC527, 22 June 1973, by D. L. Covill
Twas brillig, and the ProtocolsDid USER-SERVER in the wabe.
All mimsey was the FTP,And the RJE outgrabe,
Beware the ARPANET, my son;The bits that byte, the heads that scratch;
Beware the NCP, and shunthe frumious system patch,
![Page 10: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/10.jpg)
Structure
The arrangement of elements of an entity and their relationships toeach other.
I Linear structure – like links of a chain
I Hierarchical structure – like nested mixing bowls, or boxes withinboxes
I Trees – like family tree, or railroad yard
Grammars are a way of describing these and more complex structures.
![Page 11: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/11.jpg)
Structure
![Page 12: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/12.jpg)
Structure
Even though it is invisible, structure has great importance, evenmeaning. Sentence structure is crucial to disambiguate the followingsentences.
1. The chicken is ready to eat.
Who is eating?
2. He saw that gasoline can explode.What is exploding?
3. Flying planes can be dangerous.What is dangerous?
![Page 13: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/13.jpg)
Structure
Even though it is invisible, structure has great importance, evenmeaning. Sentence structure is crucial to disambiguate the followingsentences.
1. The chicken is ready to eat.Who is eating?
2. He saw that gasoline can explode.
What is exploding?
3. Flying planes can be dangerous.What is dangerous?
![Page 14: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/14.jpg)
Structure
Even though it is invisible, structure has great importance, evenmeaning. Sentence structure is crucial to disambiguate the followingsentences.
1. The chicken is ready to eat.Who is eating?
2. He saw that gasoline can explode.What is exploding?
3. Flying planes can be dangerous.
What is dangerous?
![Page 15: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/15.jpg)
Structure
Even though it is invisible, structure has great importance, evenmeaning. Sentence structure is crucial to disambiguate the followingsentences.
1. The chicken is ready to eat.Who is eating?
2. He saw that gasoline can explode.What is exploding?
3. Flying planes can be dangerous.What is dangerous?
![Page 16: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/16.jpg)
programminglanguages
language design,description, semantics
compilers
implementation,recognition
formallanguages
expressivity
![Page 17: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/17.jpg)
formallanguages,automata
models
computabilty
limits
complexity
efficiency
![Page 18: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/18.jpg)
Bad/Confusing Syntax In PL
The science of syntax is about unambiguous communication and assuch is neutral. Let us at least be clear.Hard to define “good” and “bad” syntax. Only experience helps, andthat takes time. The lessons of programming language design are adhoc.
![Page 19: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/19.jpg)
Bad/Confusing Syntax In PL
DO 10 I = 1.5A(I) = X + B(I)
10 CONTINUE
IF IF=THEN THEN THEN=ELSE; ELSE ELSE=IFIF IF THEN THEN=ELSE; ELSE ELSE=0;
while ( ... ) {...break;...
}
![Page 20: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/20.jpg)
Bad Syntax In Compilers
Not LL but is LR:
stm ::= "if" expr "then" stm "else" stm| "if" expr "then" stm
cf. Jacc FAQ 8.1
![Page 21: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/21.jpg)
Lexical Structure Versus Phrase Structure
In lexical structure the alphabet used is a set of characters (e.g,Latin1) and the goal is to categorize substring into different tokenseach described by their own language.In phrase structure, the alphabet used is a set tokens, and one goal isto determine if the program is syntactically correct (i.e., is the phrase inthe language or not). Actually the goal is more broad than that. Thegoal is to identify the phrase structure or derivation of syntacticallycorrect programs.
![Page 22: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/22.jpg)
Three Views of a Program
![Page 23: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/23.jpg)
Program asStream of Characters
![Page 24: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/24.jpg)
Program asStream of Tokens
![Page 25: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/25.jpg)
Program asTree Structure
![Page 26: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/26.jpg)
Whitespace
Free-format as opposed to one statement per line. In earlier versionsof FORTRAN a fixed lexical structure of one statement per punch cardstating on column 7. COBOL and Basic also used to be fixed format.For the most part languages today permit the programmer to choosethe white space (and hence the layout) of the program. Some layoutsare clearly more legible and others, so the need for style guidelines.Python, Haskell, and Occam require proper indenting.
Should a language specify the layout (use of white space)?
![Page 27: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/27.jpg)
Formal Language Versus Natural Language
Natural language is full of unresolvable differences and “jaggededges.” (This makes it rich and interesting.)
![Page 28: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/28.jpg)
Formal Language Versus Natural Language
![Page 29: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/29.jpg)
Formal Language
First some definitions:
I object language, the language being studied and meta language,the language used to communicate
I alphabet is a set of atomic symbols
I strings is a sequence of symbols
I formal language is a set of strings
Unlike natural languages, it is unambiguous for a formal language L ifs ∈ L or s /∈ L.The key challenge is to find good (finite) representation of (infinite)sets.
![Page 30: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/30.jpg)
Object and Meta Language
Object language versus meta-language. The object language is thelanguage being studied, and the meta-language is the language beingused to communicate.For example, in the class French 101 one may study the Frenchlanguage, it is the object language. To communicate to the students itmay be necessary for the teacher to speak in English. English is themeta-language in this case.
![Page 31: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/31.jpg)
AlphabetAlphabet. The alphabet is the finite set of symbols comprising theformal language. These are the basic, atomic, distinct elements.
Σ = {a,b,c,d}Σ = {0,1}Σ = {a,b, . . . ,z,0,1, . . . ,9}Σ = Latin-1
Σ = Unicode
Σ = {if, then, (, ), ’,’, . . .}
Theoreticians like to use Σ = {0,1} because it is so simple.Sometimes the symbols of the alphabet may be chosen in a way tosuggest their meaning or use. The symbols might even be composedof other sub-atomic symbols. Important character sets forprogramming languages to use and be written in are Latin-1, Latin-0,and Unicode, not just US-ASCII.
![Page 32: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/32.jpg)
ISO 8859-13 (Latin-0)
NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SS SIDLE DC1 DC2 DC3 DC4 NAK SYN ETP CAN EM SUB ESC FS GS RS US
! " # $ % & ’ ( ) * + , - . /0 1 2 3 4 5 6 7 8 9 : ; < = > ?@ A B C D E F G H I J K L M N OP Q R S T U V W X Y Z [ \ ] ˆ _` a b c d e f g h i j k l m n op q r s t u v w x y z { | } ˜ DEL
XXX XXX BPH NBH IND NEL SSA ESA HTS HTJ VTS PLD PLU RI SS2 SS3DCS PU1 PU2 SST CCH MS SPA EPA SOS XXX SCI CSI ST OSC PM APC
¡ ¢ £ ¥ S § s © ª « ¬ SHY ® ¯° ± 2 3 Z µ ¶ · z 1 º » Œ œ Y ¿A A A A A A Æ C E E E E I I I IÐ N O O O O O × Ø U U U U Y Þ ßa a a a a a æ c e e e e ı ı ı ıd n o o o o o ÷ ø u u u u y þ y
![Page 33: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/33.jpg)
Unicode Supports Many Scripts
![Page 34: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/34.jpg)
Strings
String. Given an alphabet Σ = {a,b,c,d}. A string is an (ordered)sequence of symbols from the alphabet. E.g., aaa, bac, d .The sequence may contain no symbols, this empty string is given aspecial notation "". (Some authors use ε, the Greek letter epsilon.)The most important operation on strings is string concatenation whichis sometimes denoted by the · operator. E.g., ab ·db = abdb,da ·ba = daba.
![Page 35: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/35.jpg)
More Strings
Given an alphabet Σ = {0,1}, the following are strings: 0, ε, 111, 010,1.Given an alphabet Σ = {a,b, . . . ,z}, the following are strings: ε, x ,aeiou, hello, biopsy .Using Latin-1 as the alphabet, the following are strings: ¡Hola!,Hola!, ¿Que pasa?, "Hi. Bye", großtergeminsamer Teiler.Using the alphabet Σ
{if, then, else, begin, end, (, ), ‘,’}
the following are strings:1. else )
2. if then
3. begin begin , , end end
![Page 36: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/36.jpg)
Formal Language
Formal language. A formal language is a set of strings over somealphabet.Given an alphabet Σ = {0,1}, the following are formal languages:
L0 = /0
L1 = {1}L2 = {1,0101}L3 = {0,01,00}L4 = {1,01,11,001,011,101,111, . . .}L5 = {ε,1,11,111,1111, . . .}L6 = {ε,0,1,00,01,10,11,000, . . .}= Σ∗
The collection of legal programs in a programming language is also anexample of a formal language.
![Page 37: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/37.jpg)
Chomsky Hierarchy
![Page 38: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/38.jpg)
Chomsky Hierarchy
name machine
0 unrestricted Turing machine, Post systems1 context-sensitive linear-bounded automata2 context-free pushdown automata, BNF3 regular finite automata, regular expression
I Regular expressions are used to describe tokens in programminglanguages
I BNF, CFG used to describe syntax of programming languages
I [Context-sensitive languages play no role in programminglanguages]
I attribute grammars, Post systems used in semantics
![Page 39: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/39.jpg)
Description Mechanisms1. Regular expressions (Appel, page 19, but not in Webber or
Mitchell; briefly mentioned in Sebesta)2. BNF (Chapters 2 and three of Webber)
“The programming language Ada expresses computation.”“The programming language Ada is notation for computation.”“A program in Java denotes a computation or algorithm.”“Regular expressions express formal languages (sets of strings).”“Regular expressions are notation for formal languages.”“A regular expression denotes a formal language.”“BNF expresses formal languages.”“BNF is notation for formal languages.”“A BNF definition denotes a formal language.”Now consider: a programming language (like Ada) is a formallanguage.We can try to describe Ada using regular expressions or BNF.
Jeffrey E. F. Friedl. Mastering Regular Expressions, second edition.O’Reilly: Sepastapol, California, 2002. ISBN: 0-596-00289-0.
![Page 40: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/40.jpg)
Regular Expressions
Some people, when confronted with a problem think “I know,I’ll use regular expressions.” Now they have two problems.
From a post by Jamie Zawinski to alt.religion.emacs in 1997.
![Page 41: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/41.jpg)
Regular Expressions (Syntax)
1. Empty. /0
2. Atom. Any single symbol of a ∈ Σ is a regular expression.
3. Alternation. If r1 is a regular expression and r2 is a regularexpression, then (r1 + r2) is a regular expression. [Maybe (r1|r2)is better.]
4. Concatenation. If r1 and r2 are regular expressions, then (r1 · r2)is a regular expression.
5. Closure. If r is a regular expression, then r∗ is a regularexpression.
![Page 42: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/42.jpg)
Regular Expressions (Examples)
a ·ba + ba · (b + c)a + (b · c)a∗
a + (b · c)∗
(a + (b · c))∗
(using the alphabet Σ = {a,b,c})
![Page 43: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/43.jpg)
Regular Expressions (Examples)
(Omitting the “centered dot” and using the | for alternation.)
aba | ba(b | c)a | (bc)a∗
a | (bc)∗
(a | (bc))∗
(using the alphabet Σ = {a,b,c})
![Page 44: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/44.jpg)
Regular Expressions (Examples)
011011 + 01(0 + 1)0 + (10)0∗
1 + (01)∗
(0 + (10)∗
((0 + (10)∗)∗
(using the alphabet Σ = {0,1} and omitting the · symbol)
![Page 45: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/45.jpg)
Regular Expressions (Semantics)
That is what regular expressions look like (syntax). What do regularexpressions mean (semantics)?Each regular expression denotes a formal language (a set of strings).There are an infinite number of regular expressions. (Each one isfinitely constructed.) Just like programs in programming languages.We must find a way to define the meaning or denotation of eachregular expression.So, we define a (recursive) function from regular expressions (aspieces of syntax) to sets of strings (languages).
![Page 46: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/46.jpg)
Regular Expressions (Semantics)
1. Empty. The language with no strings.
2. Atom. The language with one string of length one–a itself. Notethat the notation a is ambiguous. It stands for both the symboland the language.
3. Alternation. (r1 + r2) is the union of two languages.
4. Concatenation. (r1 · r2) is the set all strings beginning in onelanguage and then followed by a string in the second.
5. Closure. r∗ is zero, one, or more strings.
![Page 47: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/47.jpg)
Regular Expressions (Examples)
a ·b = {ab}a + b = {a,b}
a · (b + c) = {ab,ac}a + (b · c) = {a,bc}
a∗ = {ε,a,aa,aaa, . . .}a + (b · c)∗ = {ε,a,bc,bcbc, . . .}
(a + (b · c))∗ = {ε,a,bc,aa,abc,bcbc, . . .}
(using the alphabet Σ = {a,b,c})
![Page 48: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/48.jpg)
Other Definitions
Some authors replace one of the five cases of the definition
1. Empty. /0 denoting the language with no strings
with another case
1. Epsilon. ε denoting the set consisting of the single string “”
Do you see the difference?
Notice the ε is superfluous as {ε} is represented by /0∗. Can you provethis?
![Page 49: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/49.jpg)
∗Regular Expressions (Semantics)
1. Empty.
DJ /0K = {}
2. Atom.
DJa1K = {a1}...
DJanK = {an}
3. Alternation.
DJ(r1 + r2)K = DJr1K∪DJr2K
![Page 50: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/50.jpg)
∗Regular Expressions (Semantics)
4. Concatenation.
DJ(r1 · r2)K = {x · y | x ∈DJr1K, y ∈DJr2K}
where x · y is string concatenation.
5. Closure.DJr∗K =
⋃i
(DJrK)i
where Si is defined recursively as follows:
S0 = {ε}Si+1 =
{x · y | x ∈ S, y ∈ Si}
![Page 51: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/51.jpg)
∗Using the Definitions
DJ(a ·b)K = {x · y | x ∈DJaK, y ∈DJbK}= {x · y | x ∈ {a}, y ∈ {b}}= {a ·b}= {ab}
DJ(a + (a ·b))K = DJaK∪DJ(a ·b)K= {a}∪{ab}= {a, ab}
DJ(a + (b · c))∗K =⋃
i
(DJ(a + (b · c))K)i
= {ε}∪{a, bc}∪DJ(a + (b · c))K2∪ . . .= {ε, a, bc}∪{aa,abc,bca,bcbc}∪ . . .= {ε, a, aa, bc, abc, bca, bcbc}∪ . . .= {ε,a,aa,bc,aaa,abc,bca,aabc,abca,bcaa,bcbc, . . .}
![Page 52: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/52.jpg)
∗ Digression: Induction
Is the recursively defined function D well-defined? Yes.When are such recursively defined functions well-defined? Overfree-generated sets.
![Page 53: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/53.jpg)
∗ Digression: Induction
Consider the following inductive definition of a subset M of Nat , thenatural numbers, using integer multiplication ·:
I 1 ∈M,
I if n ∈M, then 9 ·n ∈M,
I if n ∈M, then 23 ·n ∈M.
Suppose we define the function g : M→ Nat inductively by:
I g(1) = 1,
I g(9 ·n) = 9,
I g(23 ·n) = 23.
Prove that 0=1!
![Page 54: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/54.jpg)
More Regular Expressions
To make regular expressions more convenient, regular expressionsare almost always extended with new notation. Here are someadditional meta-symbols commonly seen (perhaps with differentsyntax). Regular expressions with these new meta-symbols can bedefined in terms of the original definitions. Let r be a regularexpression over the alphabet Σ.
1. Optional. r? = (r + /0∗)
2. One or more. (This is a second and conflicting use of themeta-character +.) r+ = (r · r∗)
3. Any. . = (a + b + . . .+ y + z) where Σ = a,b, . . . ,y ,z.
4. Range. [a− z] = (a + b + . . .+ y + z). (Assumes that Σ isordered.)
5. Range complement. [¬c− x] = (a + b + y + z). (Assumes thatΣ is ordered.)
![Page 55: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/55.jpg)
Where areregular expressions
used?
![Page 56: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/56.jpg)
xkcd—a webcomic of romance, sarcasm, math, and language byRandall Munroe
![Page 57: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/57.jpg)
Grep
The program grep is a command-line utility for searching textoriginally written for Unix. The grep command searches text files forlines matching a given regular expression and prints matching lines tothe programs standard output.
![Page 58: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/58.jpg)
Grep
Suppose the file preamble.txt has the following lines:
We the People of the United States, in Order to form a more perfectUnion, establish Justice, insure domestic Tranquility, provide for thecommon defence, promote the general Welfare, and secure the Blessingsof Liberty to ourselves and our Posterity, do ordain and establishthis Constitution for the United States of America.
> grep and preamble.txtcommon defence, promote the general Welfare, and secure the Blessingsof Liberty to ourselves and our Posterity, do ordain and establish
> grep -i people preamble.txtWe the People of the United States, in Order to form a more perfect
![Page 59: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/59.jpg)
> grep -w do preamble.txtof Liberty to ourselves and our Posterity, do ordain and establish
(But does not match “domestic.”)
> grep -E -w "(defense)|(defence)" preamble.txtcommon defence, promote the general Welfare, and secure the Blessings
![Page 60: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/60.jpg)
Practical Regular Expressions
grep options regexp filename
Options:
-E extended regular expression-i ignore case-x match whole line-w match word in line
Syntax of regular expressions for grep
| or $ end of line. any ? optional[] set {n,m} n through m times[ˆ ] set compliment () grouping
![Page 61: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/61.jpg)
grep
grep -E ’.{5,}’ /usr/dict/words
AarhusAaronAbabaabackabacus...zombiezoologyZoroastrianzucchiniZurichzygote
![Page 62: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/62.jpg)
grep
grep -E ’.{5,}’ /usr/dict/words
AarhusAaronAbabaabackabacus...zombiezoologyZoroastrianzucchiniZurichzygote
![Page 63: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/63.jpg)
grep
grep -i -E ’[aeiou]{4,}’ /usr/dict/words
aqueousHawaiianIEEEobsequiousonomatopoeiapharmacopoeiaprosopopoeiaqueueSequoia
![Page 64: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/64.jpg)
grep
grep -i -E ’[aeiou]{4,}’ /usr/dict/words
aqueousHawaiianIEEEobsequiousonomatopoeiapharmacopoeiaprosopopoeiaqueueSequoia
![Page 65: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/65.jpg)
grep
grep -i -E ’(q[ˆu]|q$)’ /usr/dict/words
CEQColloqIQIraqqQatarQEDq’sseq
![Page 66: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/66.jpg)
grep
grep -i -E ’(q[ˆu]|q$)’ /usr/dict/words
CEQColloqIQIraqqQatarQEDq’sseq
![Page 67: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/67.jpg)
grepIs there a regular expression that matches those words whose lettersappear in alphabetical order?
grep -x -E /usr/dict/words
grep -x -E’a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?’/usr/dict/words
almostbeginbelowbiopsydirtyemptyfirstfortyghostglory
(Some of the longer words.)
![Page 68: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/68.jpg)
grepIs there a regular expression that matches those words whose lettersappear in alphabetical order?
grep -x -E /usr/dict/words
grep -x -E’a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?’/usr/dict/words
almostbeginbelowbiopsydirtyemptyfirstfortyghostglory
(Some of the longer words.)
![Page 69: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/69.jpg)
grepIs there a regular expression that matches those words whose lettersappear in alphabetical order?
grep -x -E /usr/dict/words
grep -x -E’a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?’/usr/dict/words
almostbeginbelowbiopsydirtyemptyfirstfortyghostglory
(Some of the longer words.)
![Page 70: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/70.jpg)
grepCan we modify the previous example to allow double letters?
grep -x -E’a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*’/usr/dict/words
accentaccessalmostbiopsychillychoosyeffortfloppyglossyknotty
(Some of the longer words.)
![Page 71: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/71.jpg)
grep
grep -E -e ’(y.*){3,}’ /usr/dict/wordsA
The words with at least three y’s.
polytypy chromosomal variation between populationspsychophysiologysynonymysyzygy straight-line alignment of three celestial bodies
![Page 72: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/72.jpg)
grep
grep -E -e ’(y.*){3,}’ /usr/dict/wordsA
The words with at least three y’s.
polytypy chromosomal variation between populationspsychophysiologysynonymysyzygy straight-line alignment of three celestial bodies
![Page 73: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/73.jpg)
Back references
grep -E -e ’(.)\1\1’ /usr/dict/words
AAAAAASIEEEiiiviii
![Page 74: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/74.jpg)
Back references
grep -E -e ’(.)\1\1’ /usr/dict/words
AAAAAASIEEEiiiviii
![Page 75: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/75.jpg)
PERL
Perl supports the usual regular expressions:
\ Quote the next metacharacterˆ Match the beginning of the line. Match any character (except newline)$ Match the end of the line (or before newline at the end)| Alternation(..) Grouping[..] Character class
* Match 0 or more times+ Match 1 or more times? Match 1 or 0 times{n} Match exactly n times{n,} Match at least n times{n,m} Match at least n but not more than m times
![Page 76: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/76.jpg)
PERL
PERL supports “reluctant” and “possessive” quantifiers in addition tothe “greedy” quantifiers. PERL has additional boundary matchers,Unicode character class operators, and look-ahead and look-behindmatchers. Some of these require exponential back-tracking.
\p{property} Match a character with Unicode property\P{property} Match a character without Unicode property(?=pattern) A zero-width positive look-ahead assertion(?!pattern) A zero-width negative look-ahead assertion(?<=pattern) A zero-width positive look-behind assertion(?<!pattern) A zero-width negative look-behind assertion
For more information see the documentation at www.perldoc.com.
![Page 77: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/77.jpg)
Java
Java supports regular expression in the classjava.util.regex.Pattern. It has character class operations forunion and intersection.
[...] Character class[ˆ...] Complement[...[...]] Union[...&&[...]] Intersection[...&&[ˆ...]] Substraction
![Page 78: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/78.jpg)
// Greedy quantifiersString match = find("A.*c", "AbcAbc"); // AbcAbcmatch = find("A.+", "AbcAbc"); // AbcAbc
// Nongreedy quantifiersmatch = find("A.*?c", "AbcAbc"); // Abcmatch = find("A.+?", "AbcAbc"); // Ab
// Find first substring in input that matches pattern.static String find (String pat, CharSequence input) {
final Pattern pattern = Pattern.compile(pat);final Matcher matcher = pattern.matcher(input);if (matcher.find()) {
return matcher.group();}return null;
}
![Page 79: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/79.jpg)
Using regular expressions to match comments often runs into twoproblems:
I The “every character” pattern often does not match thecharacters indicating a newline.
/* The first line,but not the second line. */
I The greedy “star” will match more than what was intended.
x = 1; /* Both comments */x = 2;/* at once! */ x = 3;
final static String commentMultilineString = "/\\*.*?\\*/";final static Pattern multiLineCommentPattern =
Pattern.compile (commentMultilineString , Pattern.DOTALL);
![Page 80: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/80.jpg)
Tokens
Regular expressions are used in the definition of programminglanguages to describe the basic lexical units–tokens.
1. NUMBER: [0−9]+
2. IDENTIFIER: [a− zA−Z ][a− zA−Z0−9]∗
3. a keyword: begin
4. a hex number:0x[0−9A−F ]+
5. REAL: [0−9] +′ .′[0−9] + (E[0−9]+)?
Metacharacters: [] +′−()?|
![Page 81: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/81.jpg)
Tools
![Page 82: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/82.jpg)
Token Stream
![Page 83: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/83.jpg)
Token Stream
Given a program
float match0 (char *s) /* find a zero */{ if (!strncmp (s, "0.0", 3))return 0.;
}
the lexical analyzer will return the streamFLOAT ID(MATCH0) LPAREN CHAR STAR ID(S) RPAREN LBRACE
BANG ID(STRNCMP) LPAREN ID(S) COMMA STRING(0.0)COMMA NUM(3) RPAREN RPAREN RETURN REAL(0.0) SEMI
RBRACE EOF
![Page 84: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/84.jpg)
Lexical Tokens
An identifier is a sequence of letters and digits; the firstcharacter must be a letter. The underscore counts as aletter. Upper- and lowercase letters are different. If the inputstream has been parsed into tokens up to a given character,the next token is taken to include the longest string ofcharacters that could possibly constitute a token. Blanks,tabs, newlines, and comments are ignored except as theyserve to separate tokens. Some white space is required toseparate otherwise adjacent identifiers, keywords, andconstants.
![Page 85: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/85.jpg)
Regular expressions
1. How do you recognize if a string matches a regular expression?
2. How do you tokenize a character string?
These questions are left to a compiler construction class.
![Page 86: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/86.jpg)
Expressiveness of RE
We like RE because they describe strings.RE are great: easily understandable, concise, recognizers are easy tomake, . . .But,
The major shortcoming of regular expressions, as far as thedescription of programming languages is concerned, is that bracketingis not expressible. Bracketing is indispensable. For example,arithmetic expressions require opening and closing parentheses.Sequences of statements are often bracketed by begin/end pairs.
The set of strings
{an ·bn}= {ε,ab,aabb,aaabbb, . . .}
is not regular!
![Page 87: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/87.jpg)
Expressiveness of RE
We like RE because they describe strings.RE are great: easily understandable, concise, recognizers are easy tomake, . . .But,The major shortcoming of regular expressions, as far as thedescription of programming languages is concerned, is that bracketingis not expressible. Bracketing is indispensable. For example,arithmetic expressions require opening and closing parentheses.Sequences of statements are often bracketed by begin/end pairs.
The set of strings
{an ·bn}= {ε,ab,aabb,aaabbb, . . .}
is not regular!
![Page 88: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/88.jpg)
BNF (Backus-Naur Form)Conventions (e.g., tokens in bold font) and meta-symbols:
::= “is defined as”| “or”[] “optional” (not necessary, EBNF){} “zero or more” (not necessary, EBNF)
BNF is a collection of rules or productions. Each rule has a LHS and aRHS. The LHS is a syntactic category (a part of the language), calleda nonterminal and the RHS is a sequence of nonterminals and tokens(called terminals).
LHS1 ::= RHS1
LHS2 ::= RHS2...
LHSn ::= RHSn
![Page 89: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/89.jpg)
Examples and Derivations
Grammar 1 and derivation: “revolutionary ideas”Example 3.1, page 129. Program with statement listsExample 3.2, page 131. Simple assignment statementsSebesta, seventh edition
![Page 90: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/90.jpg)
Example Grammar
sentence ::= subject predicate .
subject ::= article {adjective} noun
predicate ::= verb [adverb ]
article ::= A | The
adjective ::= green | colorless | big | new | revolutionary
noun ::= caterpillar | idea
verb ::= sleeps | appears | crawls
adverb ::= furiously | infrequently
![Page 91: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/91.jpg)
Derivation
Sentential form. A sentential form is a string of terminal andnonterminal symbols.
Derivation. A derivation of a sentential form is created from anonterminal by repeatedly replacing the LHS nonterminal of some rulewith its RHS.Every BNF definition is a description of a formal language (a set ofstring), the language of all sentences derivable from a syntacticcategory.
We derive two sentences used as example by Noam Chomsky—onesentence is meaningful, the other makes no sense.
![Page 92: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/92.jpg)
Example Derivation
sentence⇒1 subject predicate .⇒2 article adjective adjective noun predicate .⇒3 Aadjective adjective noun predicate .⇒4 Anewadjective noun predicate .⇒5 Anewrevolutionarynoun predicate .⇒6 Anewrevolutionary ideapredicate .⇒7 Anewrevolutionary ideaverb adverb .⇒8 Anewrevolutionary ideaappearsadverb .⇒9 Anewrevolutionary ideaappears infrequently .
![Page 93: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/93.jpg)
Example Derivation
sentence⇒1 subject predicate .⇒2 article adjective adjective noun predicate .⇒3 Aadjective adjective noun predicate .⇒4 Acolorlessadjective noun predicate .⇒5 Acolorlessgreennoun predicate .⇒6 Acolorlessgreenideapredicate .⇒7 Acolorlessgreenideaverb adverb .⇒8 Acolorlessgreenideasleepsadverb .⇒9 Acolorlessgreenideasleepsfuriously .
![Page 94: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/94.jpg)
Syntax versus Semantics
Two sentences (with the same structure, syntax) derived from thegrammar.
Anewrevolutionary ideaappears infrequently .Acolorlessgreenideasleepsfuriously .
One sentence is meaningful, the other is meaningless. Clearly, themeaning of the words (semantics) is important.
![Page 95: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/95.jpg)
Examples and Derivations
∗ Grammar 1 and derivation: “revolutionary ideas”Example 3.1, page 112. Program with statement listsExample 3.2, page 113. Simple assignment statementsSebesta, fifth edition
![Page 96: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/96.jpg)
Alternate Styles of BNF
“traditional” style:〈traditional style〉 ::= keyNonterminals in angle brackets, terminals in bold font.
CFG or “mathematical” style:A→ BaNonterminals in uppercase letters, terminal in lower case letters.
“verbatim” style:
verbatim_style ::= "key"
ISO 14977 is much like verbatim style.
Wirth, “What can we do about the unnecessary diversity of notation forsyntactic definitions?” CACM, volume 20, number 11, November 1977,page 822.
![Page 97: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/97.jpg)
Syntax Graphs
In addition to linear forms of BNF there are also syntax graphs, alsoknown as syntax diagrams.See Chapter 2, page 21 of WebberFigure 3.6, page 122 of Sebesta, fifth edition (not in sixth edition).See Hyper GOS syntax diagram generator.
![Page 98: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/98.jpg)
Syntax Graphs
if_statement ::="if" condition "then"sequence_of_statements{ "elsif" condition "then"sequence_of_statements }[ "else"sequence_of_statements ]"end" "if" ";"
![Page 99: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/99.jpg)
∗Grammar
A grammar is a 4-tuple 〈T ,N,P,S〉I T is the set of terminal symbols,
I N set of nonterminal symbols, T ∩N = /0,
I S, a nonterminal, is the start symbol,
I P are the productions of the grammar.
A production has the form α→ β where α and β are strings ofterminals and nonterminals (α can’t be the empty string). We areprimarily interested in context-free grammars in which α is a singlenonterminal.
![Page 100: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/100.jpg)
∗Example 1
Let G be the grammar defined by the following productions.
S → AB
A → ε
A → Aa
B → ε
B → B b
(We infer easily that T , the set of terminals, is {a,b}; N, the set ofnonterminals, is {A,B}; and S is the start symbol.)
LG = {ai bj}
![Page 101: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/101.jpg)
∗Example 2
Let G be the grammar defined by the following productions.
S → ε
S → aS b
(We infer easily that T , the set of terminals, is {a,b}; N, the set ofnonterminals, is {S}; and S is the start symbol.)
LG = {ai bi}
![Page 102: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/102.jpg)
∗What’s a Grammar?
A grammar is a description of a language.Every grammar denotes a formal language (set of strings). A string ofnonterminals ω is in the language, if it is derivable from the start thesymbol.We say γαδ⇒ γβδ (derives in one step) whenever α→ β is aproduction. We use α
∗⇒β as the reflexive and transitive closure of the⇒ relation. A string ω of terminals is in the language defined bygrammar G, if S
∗⇒ω. So, L(G) = {ω ∈ T ∗ | S ∗⇒ω}. In this case thestring ω is called a sentence of G. If S
∗⇒α where α may containnonterminals, then we say α is a sentential form of the grammar G.
![Page 103: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/103.jpg)
∗Chomsky Hierarchy
name grammar
0 unrestricted1 context-sensitive |β| ≥ |α|2 context-free α ∈ N3 regular A→ ωB, A→ ω
![Page 104: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/104.jpg)
Same as BNF
Context-free grammar is the same as BNF.
S ::= AB
A ::=
A ::= Aa
B ::=
B ::= B b
OR
S ::= AB
A ::= Aa | εB ::= B b | ε
![Page 105: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/105.jpg)
Parse Tree
A formal definition of a parse tree for a context-free grammar ispossible, but it is not illuminating. But the trees in the figures aresuggestive. Each production used in the derivation of a string appearsas a subtree in the diagram. The left-hand-side nonterminal appearsas a node, and all the grammar symbols in the right-hand side of theproduction appear as children of this node.
![Page 106: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/106.jpg)
An example form Sebesta, 7th, Example 3.2, page 131.An grammar for simple assignment statement
assign ::= id := expr
expr ::= expr + expr
| expr ∗ expr
| (expr )
| id
id ::= A | B | C
![Page 107: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/107.jpg)
Example Derivation
assign⇒
id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 108: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/108.jpg)
Example Derivation
assign⇒id :=expr ⇒
A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 109: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/109.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒
A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 110: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/110.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒
A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 111: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/111.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒
A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 112: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/112.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒
A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 113: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/113.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒
A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 114: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/114.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒
A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 115: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/115.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒
A :=B∗(A+C)
![Page 116: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/116.jpg)
Example Derivation
assign⇒id :=expr ⇒A :=expr ⇒A := id ∗expr ⇒A :=B∗expr ⇒A :=B∗(expr )⇒A :=B∗( id +expr )⇒A :=B∗(A+expr )⇒A :=B∗(A+ id )⇒A :=B∗(A+C)
![Page 117: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/117.jpg)
Abstract Syntax Tree
Concrete parse trees may have a lot of extra stuff and inconvenient touse directly.(Webber, Section 3.8 Abstract Syntax Trees, page 38.)
![Page 118: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/118.jpg)
Example Derivation
assign⇒id := expr ⇒A := expr ⇒A := id ∗ expr ⇒A := B ∗ expr ⇒A := B ∗ (expr )⇒A := B ∗ ( id +expr )⇒A := B ∗ (A+expr )⇒A := B ∗ (A+ id )⇒A := B ∗ (A+C)
![Page 119: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/119.jpg)
Concrete Parse Tree
![Page 120: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/120.jpg)
Abstract Syntax Tree
A:=B*(A+C). expr can be replaced by the kind of expression +, *. idcontains no information. Parens are no longer needed.
![Page 121: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/121.jpg)
Parse Tree
If A:=(B*A)+C was the statement, then the abstract syntax would bedifferent and so no important information would be lost.
![Page 122: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/122.jpg)
Dangling else
An ambiguous grammar is one in which some sentence has more thanone parse tree. [Sebesta, 3.3.1.7 Ambiguity, page 132ff.]The grammar of Example 3.3 [Sebesta, 7th, page 132], simpleexpressions, is ambiguous.Also, this grammar of if/then/else statements is ambiguous.Sebesta, 7th, 3.3.1.10 An Unambiguous Grammar for if-then-else,page 137; Figure 3.5, page 138.
![Page 123: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/123.jpg)
Two Parse Trees
[Convert from PS tricks.]
![Page 124: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/124.jpg)
Equivalent Grammar
Just because a grammar is ambiguous does not mean that allgrammars for the same language are ambiguous.
S → if C then S | if C then S else S | S′
This grammar for the same language is not.
S → MS | UMS
MS → if C then MS else MS | S′
UMS → if C then S | if C then MS else UMS
![Page 125: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/125.jpg)
Alternative Language
If the “natural” grammar is ambiguous, could it mean that the constructis confusing to programmers? Is another approach possible?Many languages have redesigned the syntax of the if statement:
S → if C then S endif |if C then S else S endif |S′
The two statements with different structure now have different syntax.
if C1 then if C2 then S1 endif else S2 endifif C1 then if C2 then S1 else S2 endif endif
![Page 126: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/126.jpg)
Tools
![Page 127: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/127.jpg)
![Page 128: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/128.jpg)
Recursive Descent Parsers
Grammars in the right form permit a parser of simple recursivefunctions. Each nonterminal turns into a recursive function forming aset of mutually recursive functions. Each function does a case analysison the input to determine which production/RHS to follow. Terminals inthe RHS are matched (consumed) and nonterminals turned intorecursive calls.Java Applet demo: http://cswebsrv.cs.binghamton.edu/˜zdu/parsdemo/recframe.htmlhttp://www.uni-paderborn.de/fachbereich/AG/agkastens/compiler/parsdemo/recframe.htmlSee example in Figure 2.10 in Scott.
![Page 129: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/129.jpg)
Limitations of BNF/CFG
〈block〉 ::= 〈block identifier〉 :
begin 〈statement〉 {〈statement〉}end 〈block identifier〉 ;
〈call〉 ::= 〈identifier〉 ({〈expression〉} )
![Page 130: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/130.jpg)
Beyond BNF/CFG
Formalisms with greater expressive power
I unrestricted grammars
I Post systems
I attribute grammars
Such formalisms are used in describing semantics — a later topic.Notice the pragmatic definition of syntax (at or below context-free) andsemantics (beyond context-free).
The set of strings
{an ·bn · cn}= {ε,abc,aabbcc,aaabbbccc, . . .}
is not context free!
![Page 131: Doctor Syntax was the fictitious hero of several 19th ...ryan/cse4250/notes/syntax.pdf · Doctor Syntax was the fictitious hero of several 19th century satirical ... Post Systems](https://reader031.vdocuments.site/reader031/viewer/2022022503/5ab0f4587f8b9a6b468be793/html5/thumbnails/131.jpg)
programminglanguages
language design,description, semantics
compilers
implementation,recognition
formallanguages
expressivity