Post on 21-Dec-2015
Chapter 2 Language & Syntax Description
Section 1 Alphabet & String

1. Alphabet: a non-empty set of symbols, usually written as $\Sigma$, $V$, or another upper-case Greek letter.
2. Symbol (character): an element of the alphabet; the smallest unit of a language.
3. String: a finite sequence of symbols from the alphabet.
Note: The null string is the string that contains no symbols, written $\varepsilon$.
4. Sentence: a string of symbols from the alphabet that is built according to given construction rules.
5. Language: a set of sentences over the alphabet.
Note: By convention, a symbol is written $a, b, c, \dots$; a string is written $\alpha, \beta, \gamma, \dots$; a set of strings is written $A, B, C, \dots$.
6. Operations on sets of strings
1) Concatenation (product)
Let $A = \{\alpha_1, \alpha_2, \dots\}$ and $B = \{\beta_1, \beta_2, \dots\}$ be sets of strings. Their (Cartesian) product $AB$ is defined as $AB = \{\alpha\beta \mid \alpha \in A \text{ and } \beta \in B\}$.
Notes:
1) The product of a string set with itself is called a power of the string set.
2) $A^0 = \{\varepsilon\}$.
3) The $n$-th power $A^n$ of an alphabet $A$ is the set of all strings of length $n$.
2) Closure and positive closure
a) Closure: $A^* = A^0 \cup A^1 \cup A^2 \cup \dots$ is the set of all strings over the alphabet $A$, including the null string.
b) Positive closure: $A^+ = A^1 \cup A^2 \cup \dots = A^* - \{\varepsilon\}$.
Note: A language over $A$ is a subset of $A^*$; a language that does not contain the null string is a subset of the positive closure $A^+$.
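The product, power and closure operations can be tried out on finite string sets. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, the empty Python string stands in for the null string, and the closures are cut off at a maximum power because $A^*$ itself is infinite.

```python
from itertools import product as cartesian_product

EPSILON = ""  # assumption: the empty Python string stands in for the null string


def concat(A, B):
    """Product AB = {ab | a in A and b in B} of two finite string sets."""
    return {a + b for a, b in cartesian_product(A, B)}


def power(A, n):
    """n-th power of A, with A^0 = {EPSILON}."""
    result = {EPSILON}
    for _ in range(n):
        result = concat(result, A)
    return result


def closure(A, max_power):
    """Finite cut-off of A*: the union of A^0, A^1, ..., A^max_power."""
    strings = set()
    for n in range(max_power + 1):
        strings |= power(A, n)
    return strings


def positive_closure(A, max_power):
    """Finite cut-off of A+: the union of A^1, ..., A^max_power."""
    strings = set()
    for n in range(1, max_power + 1):
        strings |= power(A, n)
    return strings


alphabet = {"a", "b"}
print(sorted(power(alphabet, 2)))  # every string of length 2 over {a, b}
print(closure(alphabet, 2) - positive_closure(alphabet, 2))  # only the null string
```

With the same cut-off, the two closures differ only by the null string, which mirrors $A^+ = A^* - \{\varepsilon\}$.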
Section 2 Grammar & Language
1. Basic concepts

a. Grammar
A grammar is a set of formal production rules that describe how syntax elements are constructed.
Notes:
1) Syntax elements include sentences and the words in sentences; a language is composed of sentences.
2) A production rule has the form left-side → right-side, which can be read as "left-side is defined as right-side", "left-side derives right-side", or "left-side produces right-side". It expresses the relation between the two sides.

b. Non-terminal symbol
– A symbol that appears on the left side of a rule, is enclosed in <>, and expresses a syntax concept.
– The set of non-terminal symbols is written $V_N$.

c. Terminal symbol
– A string in the language that cannot be decomposed further (including single-character strings). The set of terminal symbols is written $V_T$.
Note: Terminal symbols are the basic elements of a sentence.
d. Start symbol
– A special non-terminal symbol that is the core of the defined syntax; every derivation begins from it.
Note: The start symbol is also called the "distinguished symbol".

e. Production
– A rule that defines a relation between strings; a grammar is a set of such rules.
Form: $A \to \alpha$ ($A$ produces $\alpha$).
E.g. <Sentence> → <Subject><Predicate>
f. Derivation
– The process that starts from the start symbol and derives a sentence by repeatedly replacing the left side of a production rule with its right side.
– Leftmost (rightmost) derivation: each step applies exactly one production rule and replaces the leftmost (rightmost) non-terminal symbol with the right side of the rule.
Note: Leftmost and rightmost derivations are the two standard derivation orders; in most texts the term "canonical derivation" refers specifically to the rightmost derivation.
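A leftmost derivation can be traced step by step. The sketch below is illustrative only: the grammar $S \to AB$, $A \to a$, $B \to b$ is a hypothetical example that does not appear in the slides, upper-case letters are assumed to be non-terminals, and the function names are made up for this example.

```python
def leftmost_step(form, production):
    """Apply one production to the leftmost non-terminal of a sentential form.

    Assumption: non-terminals are single upper-case letters, and every
    other character is a terminal.
    """
    lhs, rhs = production
    index = next((i for i, c in enumerate(form) if c.isupper()), None)
    if index is None or form[index] != lhs:
        raise ValueError(f"{lhs} is not the leftmost non-terminal of {form!r}")
    return form[:index] + rhs + form[index + 1:]


def derive(start, steps):
    """Return the list of sentential forms produced by a leftmost derivation."""
    forms = [start]
    for production in steps:
        forms.append(leftmost_step(forms[-1], production))
    return forms


# Hypothetical grammar: S -> AB, A -> a, B -> b
print(" => ".join(derive("S", [("S", "AB"), ("A", "a"), ("B", "b")])))
```

Applying `("B", "b")` to the form `AB` raises an error here, because `A` rather than `B` is the leftmost non-terminal; a rightmost derivation would make the opposite choice.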
g. Reduction
– Reduction is the inverse of derivation: starting from a given sentence of the language, the right sides of production rules are repeatedly replaced with their left sides until the start symbol is reached.
– Leftmost (rightmost) reduction is the inverse of rightmost (leftmost) derivation.
Note: In most texts the term "canonical reduction" refers specifically to leftmost reduction, the inverse of the rightmost derivation.
h. Sentential form, sentence & language
• Sentential form
– Any string derived from the start symbol in zero or more derivation steps. Written $S \Rightarrow^* \alpha$, $\alpha \in (V_N \cup V_T)^*$.
• Sentence
– A sentential form that contains only terminal symbols.
• Language
– The set of sentences derived from $S$ in one or more derivation steps. Written $L(G)$: $L(G) = \{\alpha \mid S \Rightarrow^+ \alpha \text{ and } \alpha \in V_T^*\}$.
i. Recursive definition of grammar rules
– A rule is recursive when the non-terminal symbol being defined also appears in its own definition.
Note: Take care when defining a grammar recursively. The recursion needs an exit (a non-recursive alternative); otherwise no derivation can ever end in a sentence.
j. Extended notations of grammar rules
Extended BNF (Backus-Naur Form) adds the following notations:
– ( ): factoring out a common part.
E.g. $U \to ax \mid ay \mid az$ is rewritten as $U \to a(x \mid y \mid z)$.
– { }: repetition, optionally with bounds on the number of repetitions.
E.g. <Identifier> → <Letter>{<Letter>|<Digit>}$_0^5$ (the braced part repeats at least 0 and at most 5 times).
– [ ]: optional part.
E.g. <Integer> → [+|-]<Digit>{<Digit>}
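The brace and bracket notations map closely onto regular-expression operators. As a rough sketch, the two example rules could be written with Python's `re` module as follows. Two assumptions are made here that the slides do not state: <Letter> and <Digit> are taken to be ASCII letters and digits, and the bounds on the identifier rule are read as 0 to 5 repetitions.

```python
import re

# Assumptions: <Letter> is an ASCII letter and <Digit> is an ASCII digit.
LETTER = "[A-Za-z]"
DIGIT = "[0-9]"

# <Identifier> -> <Letter>{<Letter>|<Digit>}, braces repeated 0 to 5 times
IDENTIFIER = re.compile(f"{LETTER}(?:{LETTER}|{DIGIT}){{0,5}}")

# <Integer> -> [+|-]<Digit>{<Digit>}
INTEGER = re.compile(f"[+-]?{DIGIT}(?:{DIGIT})*")


def is_identifier(text):
    """True if the whole text matches the <Identifier> rule."""
    return IDENTIFIER.fullmatch(text) is not None


def is_integer(text):
    """True if the whole text matches the <Integer> rule."""
    return INTEGER.fullmatch(text) is not None


print(is_identifier("x1"), is_identifier("1x"))
print(is_integer("-42"), is_integer("+"))
```

Here `{ }` becomes `*` (or `{0,5}` when bounded), `[ ]` becomes `?`, and `( )` becomes a non-capturing group.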
k. Meta-language symbol
Symbols used to describe the relations among grammar symbols, such as "→" and "|", are called meta-language symbols.
Section 2 Grammar & Language
2. Formal definition

a. Grammar definition
A grammar $G$ is defined as a quadruple $(V_N, V_T, P, S)$: the set of non-terminal symbols, the set of terminal symbols, the set of productions, and the start symbol.

b. Classification of grammars
According to the restrictions placed on their production rules, grammars are classified into four types: type-0, type-1, type-2 and type-3.
(1) Type-0 grammar (phrase-structure grammar, or unrestricted grammar)
– Every production in $P$ has the form $\alpha \to \beta$, where $\alpha \in V^+$, $\beta \in V^*$, $V = V_N \cup V_T$, and $\alpha$ contains at least one non-terminal symbol.
Notes:
– The automaton that recognizes a type-0 language is the Turing machine.
– A type-0 grammar places the fewest restrictions on its productions.
– The other types of grammar are obtained by restricting the form of the productions of a type-0 grammar.
(2) Type-1 grammar (context-sensitive grammar, or length-increasing grammar)
– Every production $\alpha \to \beta$ in $P$ satisfies $|\beta| \ge |\alpha|$, except for $S \to \varepsilon$. If $S \to \varepsilon$ is in $P$, $S$ cannot appear on the right side of any production.
– Equivalently, every production in $P$ has the form $\alpha A \beta \to \alpha\gamma\beta$ (where $\alpha, \beta \in V^*$, $A \in V_N$, $\gamma \in V^+$), except for $S \to \varepsilon$.
Notes:
– The automaton that recognizes a type-1 language is the linear bounded automaton (LBA).
– In a type-1 grammar, a non-terminal symbol is replaced with regard to its context.
– A non-terminal symbol cannot be replaced by $\varepsilon$; the only exception is that the start symbol may produce $\varepsilon$.
(3) Type-2 grammar (context-free grammar)
– Every production in $P$ has the form $A \to \beta$, where $A \in V_N$ and $\beta \in V^*$.
Notes:
– The left side of each production is a single non-terminal symbol; the right side may consist of non-terminal symbols, terminal symbols, or $\varepsilon$.
– The automaton that recognizes a type-2 language is the pushdown automaton (PDA).
(4) Type-3 grammar (regular grammar: right-linear grammar or left-linear grammar)
– Every production in $P$ has the form $A \to \alpha B$ or $A \to \alpha$ (right-linear), or the form $A \to B\alpha$ or $A \to \alpha$ (left-linear), where $A, B \in V_N$ and $\alpha \in V_T^*$.
Notes:
– The productions of a type-3 grammar are either all right-linear or all left-linear; the two kinds cannot be mixed in one grammar.
– If all productions are left-linear, the grammar is called a left-linear grammar.
– If all productions are right-linear, the grammar is called a right-linear grammar.
– The automaton that recognizes a type-3 language is the finite-state automaton.
– Type-2 grammar = self-embedding grammar (with productions of the form $S \to aSb$) + regular grammar; that is, any type-2 grammar without the self-embedding property is equivalent to a regular grammar.
Summary of the grammar hierarchy:

Hierarchy | Alias | Production form | Automaton
Type-0 | Unrestricted grammar | $\alpha \to \beta$, $\alpha \in V^+$ | Turing machine
Type-1 | Context-sensitive grammar | $\alpha A \beta \to \alpha\gamma\beta$, $A \in V_N$ | Linear bounded automaton
Type-2 | Context-free grammar | $A \to \beta$, $A \in V_N$ | Pushdown automaton
Type-3 | Regular grammar | $A \to \alpha B$, $A \to \alpha$; $A, B \in V_N$, $\alpha \in V_T^*$ | Finite automaton
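The restrictions in the hierarchy can be checked mechanically. The following sketch returns the most restrictive type whose production form fits every rule. It is an illustration under assumptions, not the slides' own procedure: productions are `(left, right)` pairs of strings, upper-case letters are non-terminals, the empty string stands for $\varepsilon$, and the function names are hypothetical.

```python
def is_nonterminal(symbol):
    # Assumption: non-terminals are upper-case letters.
    return symbol.isupper()


def _linear(lhs, rhs, right):
    """True if lhs -> rhs is right-linear (right=True) or left-linear (right=False)."""
    if len(lhs) != 1 or not is_nonterminal(lhs):
        return False
    if rhs and is_nonterminal(rhs[-1] if right else rhs[0]):
        rhs = rhs[:-1] if right else rhs[1:]  # strip the single allowed non-terminal
    return not any(is_nonterminal(c) for c in rhs)


def grammar_type(productions, start="S"):
    """Return 3, 2, 1 or 0: the most restrictive type that fits all productions."""
    if any(not any(is_nonterminal(c) for c in lhs) for lhs, _ in productions):
        raise ValueError("every left side needs at least one non-terminal")
    # Type-3: all right-linear or all left-linear, never a mixture.
    if all(_linear(l, r, True) for l, r in productions) or all(
        _linear(l, r, False) for l, r in productions
    ):
        return 3
    # Type-2: every left side is a single non-terminal.
    if all(len(l) == 1 for l, _ in productions):
        return 2
    start_on_right = any(start in r for _, r in productions)

    def non_contracting(lhs, rhs):
        if rhs == "":  # only S -> epsilon is allowed, and then S must not recur
            return lhs == start and not start_on_right
        return len(rhs) >= len(lhs)

    # Type-1: no production shrinks the sentential form.
    if all(non_contracting(l, r) for l, r in productions):
        return 1
    return 0


print(grammar_type([("S", "aS"), ("S", "a"), ("S", "b")]))
print(grammar_type([("S", "aSb"), ("S", "ab")]))
print(grammar_type([("S", "aSBc"), ("S", "abc"), ("cB", "Bc"), ("bB", "bb")]))
```

A grammar that mixes left-linear and right-linear rules, such as $S \to aA$, $A \to Sb$, falls through the type-3 test and is reported as type-2.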
c. Type-i language
– A language produced by a type-i grammar. Written $L(G)$: $L(G) = \{\alpha \mid \alpha \in V_T^* \text{ and } S \Rightarrow^+ \alpha\}$.

Example: Let $G_1 = (\{S\}, \{a, b\}, P, S)$, where $P$ includes:
(0) $S \to aS$
(1) $S \to a$
(2) $S \to b$
Then $L(G_1) = \{a^i(a \mid b) \mid i \ge 0\}$.

Example: Let $G_2 = (\{S\}, \{a, b\}, P, S)$, where $P$ includes:
(0) $S \to aSb$
(1) $S \to ab$
Then $L(G_2) = \{a^n b^n \mid n \ge 1\}$.
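The languages of the two example grammars can be checked by enumerating short sentences. The sketch below performs a breadth-first search over leftmost derivations. It assumes that upper-case letters are non-terminals and that no production has an empty right side, so any sentential form longer than the length limit can be discarded; the function name is hypothetical.

```python
from collections import deque


def sentences(productions, start="S", max_len=6):
    """Return every sentence of length <= max_len derivable from start.

    Assumption: productions never shrink a sentential form (no epsilon
    rules), so forms longer than max_len cannot lead to a short sentence.
    """
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        form = queue.popleft()
        index = next((i for i, c in enumerate(form) if c.isupper()), None)
        if index is None:  # only terminals left: the form is a sentence
            found.add(form)
            continue
        for lhs, rhs in productions:
            if lhs == form[index]:
                new_form = form[:index] + rhs + form[index + 1:]
                if len(new_form) <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
    return found


G1 = [("S", "aS"), ("S", "a"), ("S", "b")]
G2 = [("S", "aSb"), ("S", "ab")]
print(sorted(sentences(G1, max_len=3), key=lambda s: (len(s), s)))
print(sorted(sentences(G2, max_len=6), key=len))
```

For $G_2$ and a limit of 6 this yields exactly `ab`, `aabb` and `aaabbb`, in line with $\{a^n b^n \mid n \ge 1\}$.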
Notes: The grammars used in lexical analysis and syntax analysis place the following restrictions on productions:
– There is no production of the form $P \to P$; such a production is useless and only introduces ambiguity.
– Every non-terminal symbol $P$ must be reachable and must derive a terminal string:
• Starting from the start symbol $S$, there exists a derivation $S \Rightarrow^* \alpha P \beta$.
• $P$ must derive a terminal string, that is, $P \Rightarrow^+ \gamma$ with $\gamma \in V_T^*$.
Section 3 Grammar construction and simplification
1. Constructing a grammar from a language

Example 1: Let $L_1 = \{a^{2n} b^n \mid n \ge 1 \text{ and } a, b \in V_T\}$. Construct a grammar $G_1$ for $L_1$.
For $n = 1$ the sentence is $aab$; for $n = 2$, $aaaabb$; for $n = 3$, $aaaaaabbb$; and so on.
So we have:
$S \to aaSb$
$S \to aab$
Example 2: Let $L_2 = \{a^i b^j c^k \mid i, j, k \ge 1 \text{ and } a, b, c \in V_T\}$. Construct a grammar $G_2$ for $L_2$.
$S \to aS$, $S \to aB$
$B \to bB$, $B \to bC$
$C \to cC \mid c$
Example 3: Let $L_3 = \{\omega \mid \omega \in \{a, b\}^* \text{ and } \omega \text{ contains as many } a\text{'s as } b\text{'s}\}$. Construct a grammar $G_3$ for $L_3$.
$S \to \varepsilon$
$S \to bB$, $S \to aA$
$A \to bS \mid b$, $A \to aAA$
$B \to aS \mid a \mid bBB$
An alternative grammar for the same language:
$S \to \varepsilon$, $S \to aSbS$, $S \to bSaS$
Example 4: Let $L_4 = \{\omega \mid \omega \in \{0, 1\}^* \text{ and the number of 1's in } \omega \text{ is even}\}$. Construct a grammar $G_4$ for $L_4$.
$S \to \varepsilon$
$S \to 0S$, $S \to 1A$
$A \to 0A$, $A \to 1S$
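Because the grammar of Example 4 is right-linear, it can be run directly as a finite automaton whose states are the non-terminal symbols. The sketch below does this and compares the result with a direct count of 1's. The function name and the `(left, right)` pair representation are assumptions made for this illustration; it only handles rules of the form $A \to xB$ (one terminal followed by one non-terminal) and $A \to \varepsilon$.

```python
from itertools import product

# Grammar of Example 4; the empty string stands for epsilon.
G4 = [("S", ""), ("S", "0S"), ("S", "1A"), ("A", "0A"), ("A", "1S")]


def accepts(productions, start, word):
    """Membership test for a right-linear grammar, run as an automaton.

    Assumption: every rule is either A -> xB (x a single terminal) or
    A -> epsilon. The set `states` holds the non-terminals that may
    still have to be expanded after reading a prefix of the word.
    """
    states = {start}
    for symbol in word:
        states = {
            rhs[1]
            for lhs, rhs in productions
            if lhs in states and len(rhs) == 2 and rhs[0] == symbol
        }
    # The word is accepted if some remaining non-terminal can vanish.
    return any(lhs in states and rhs == "" for lhs, rhs in productions)


# Compare with the definition of L4 for every binary word up to length 6.
for length in range(7):
    for letters in product("01", repeat=length):
        word = "".join(letters)
        assert accepts(G4, "S", word) == (word.count("1") % 2 == 0)
print("G4 agrees with L4 on all words up to length 6")
```

The non-terminal `S` plays the role of the state "even number of 1's so far" and `A` the state "odd number of 1's so far".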
2. Grammar simplification
a. Because a language can be described by different grammars, we should choose the grammar that has the fewest productions and best suits the properties of the language.
b. A grammar may contain redundant productions that are never useful in a derivation. These productions should be deleted:
– productions of the form $P \to P$;
– productions that can never derive a terminal string;
– productions whose left-side non-terminal symbol is not the start symbol and does not appear on the right side of any production.
c. Steps of simplification:
– Find the productions of the form $P \to P$ and delete them;
– If a production can never be used in a derivation from the start symbol, delete it;
– If a production can never derive a terminal string, delete it;
– Tidy up and renumber the remaining productions.
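Example: Simplify the following grammar.
(0) $S \to Be$
(1) $S \to Ec$
(2) $A \to Ae$
(3) $A \to e$
(4) $A \to A$
(5) $B \to Ce$
(6) $B \to Af$
(7) $C \to Cf$
(8) $D \to f$
Production (4) has the form $P \to P$. $E$ and $C$ can never derive a terminal string, which removes (1), (5) and (7). $D$ cannot be reached from $S$, which removes (8).
Result:
(0) $S \to Be$
(1) $A \to Ae$
(2) $A \to e$
(3) $B \to Af$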
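The simplification steps can be sketched as a short program. This is one possible arrangement, not necessarily the order intended by the slides: it removes $P \to P$ rules first, then rules that involve non-terminals unable to derive a terminal string, and only then rules that cannot be reached from the start symbol, since doing the reachability pass last avoids leaving newly unreachable rules behind. Upper-case letters are assumed to be non-terminals, and the function name is hypothetical.

```python
def simplify(productions, start="S"):
    """Remove useless productions from a list of (left, right) pairs."""
    # Step 1: drop productions of the form P -> P.
    rules = [(lhs, rhs) for lhs, rhs in productions if lhs != rhs]

    # Step 2: find the non-terminals that can derive a terminal string.
    productive = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in productive and all(
                not c.isupper() or c in productive for c in rhs
            ):
                productive.add(lhs)
                changed = True
    rules = [
        (lhs, rhs)
        for lhs, rhs in rules
        if all(not c.isupper() or c in productive for c in rhs)
    ]

    # Step 3: keep only what is reachable from the start symbol.
    reachable = {start}
    pending = [start]
    while pending:
        current = pending.pop()
        for lhs, rhs in rules:
            if lhs == current:
                for c in rhs:
                    if c.isupper() and c not in reachable:
                        reachable.add(c)
                        pending.append(c)
    return [(lhs, rhs) for lhs, rhs in rules if lhs in reachable]


grammar = [
    ("S", "Be"), ("S", "Ec"), ("A", "Ae"), ("A", "e"), ("A", "A"),
    ("B", "Ce"), ("B", "Af"), ("C", "Cf"), ("D", "f"),
]
for lhs, rhs in simplify(grammar):
    print(f"{lhs} -> {rhs}")
```

On the example grammar this leaves the four productions $S \to Be$, $A \to Ae$, $A \to e$ and $B \to Af$.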
3. Constructing a context-free grammar without $\varepsilon$-productions
a. A context-free grammar without $\varepsilon$-productions satisfies the following conditions:
– If $P$ contains the production $S \to \varepsilon$, then $S$ does not appear on the right side of any production, where $S$ is the start symbol of the grammar;
– There are no other $\varepsilon$-productions in $P$.
b. The algorithm to construct a context-free grammar without $\varepsilon$-productions transforms $G = (V_N, V_T, P, S)$ into $G' = (V'_N, V'_T, P', S')$:
(1) Find all non-terminal symbols that can derive $\varepsilon$ in some number of steps, and put them into the set $V_0$;
(2) Construct the production set $P'$ of $G'$ by the following steps:
(A) If a symbol in $V_0$ appears on the right side of a production, split the production into two: one in which the symbol is replaced by $\varepsilon$ and one in which the symbol is kept. Put the new productions into $P'$.
(B) Otherwise, put the productions of the symbol into $P'$ unchanged, except for its $\varepsilon$-productions.
(C) If $P$ contains the production $S \to \varepsilon$, replace it with $S' \to \varepsilon \mid S$ and put these into $P'$; let $S'$ be the start symbol of $G'$, and let $V'_N = V_N \cup \{S'\}$.
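Example: Let $G_1 = (\{S\}, \{a, b\}, P, S)$, where $P$:
(0) $S \to \varepsilon$
(1) $S \to aSbS$
(2) $S \to bSaS$
Step (1): $V_0 = \{S\}$.
Step (2): rule (A) is applied to productions (1) and (2), and rule (C) to production (0).
So $G_1' = (\{S', S\}, \{a, b\}, P', S')$, where $P'$:
(0) $S' \to \varepsilon \mid S$
(1) $S \to abS \mid aSbS \mid aSb \mid ab$
(2) $S \to baS \mid bSaS \mid bSa \mid ba$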
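The construction can be sketched in Python as follows. The code is an illustration under assumptions rather than the slides' exact procedure: productions are `(left, right)` string pairs, upper-case letters are non-terminals, the empty string stands for $\varepsilon$, and the names `nullable_set`, `eliminate_epsilon` and the new start symbol `S'` are chosen for this example. Each occurrence of a symbol from $V_0$ is either kept or dropped, which generalizes step (A) to right sides with several such occurrences.

```python
from itertools import product


def nullable_set(productions):
    """Step (1): the set V0 of non-terminals that can derive epsilon."""
    v0 = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            # An empty right side, or one made only of V0 symbols, is nullable.
            if lhs not in v0 and all(c in v0 for c in rhs):
                v0.add(lhs)
                changed = True
    return v0


def eliminate_epsilon(productions, start="S", new_start="S'"):
    """Step (2): build P' without epsilon-productions (except for new_start)."""
    v0 = nullable_set(productions)
    new_rules = set()
    for lhs, rhs in productions:
        # Every V0 symbol is either kept or replaced by the empty string.
        options = [(c, "") if c in v0 else (c,) for c in rhs]
        for choice in product(*options):
            body = "".join(choice)
            if body and body != lhs:  # skip epsilon rules and P -> P rules
                new_rules.add((lhs, body))
    if start in v0:  # rule (C): add S' -> epsilon | S
        new_rules |= {(new_start, ""), (new_start, start)}
    return new_rules


G1 = [("S", ""), ("S", "aSbS"), ("S", "bSaS")]
for lhs, rhs in sorted(eliminate_epsilon(G1)):
    print(f"{lhs} -> {rhs or 'epsilon'}")
```

For the example grammar this produces the ten productions of $P'$: four alternatives for each of the two recursive rules, plus $S' \to \varepsilon$ and $S' \to S$.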
Section 4 Syntax tree and ambiguity of a grammar
1. Syntax tree
a. Definition
– A tree that expresses the structure of a sentence in a language.
b. Function
– Presents the syntax analysis process visually and directly.
– Helps decide whether a grammar is ambiguous.
[Figure: An example of a syntax tree, rooted at S, with interior nodes B, S and A and leaves a and b.]
c. Basic terms in a syntax tree
(1) Sub-tree
A tree composed of a non-leaf node of the syntax tree and all its descendant nodes.
(2) Pruning a sub-tree
Removing all the descendants of the root of a sub-tree, so that only the root remains.
(3) Sentential form
The sequence of all leaves, read from left to right, in any snapshot of the growing syntax tree.
(4) Phrase
The string formed by the leaves of a sub-tree, read from left to right, is called a phrase relative to the root of the sub-tree.
– Simple phrase (direct phrase): if a phrase is derived in one step from the root of a sub-tree, it is called a simple phrase relative to the root of the sub-tree.
– Phrase of a sentential form: a phrase of some sub-tree of the syntax tree of that sentential form.
(5) Handle
The leftmost simple phrase of a sentential form.
Note: In leftmost reduction, the core task is finding the handle.
[Figure: Handles in a syntax tree. The example tree is shown again with its sub-trees numbered 1 to 6.]
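Phrases, simple phrases and the handle can be read off a syntax tree mechanically. The sketch below is illustrative only. The tree it uses is an assumption: it is built from the node labels that survive from the figure, read as the productions $S \to aB$, $B \to aBB$, $B \to bS$, $S \to bA$, $A \to a$ and $B \to b$, which may not match the original drawing. Trees are represented as `(label, children)` pairs with plain strings as leaves, and the function names are hypothetical.

```python
def leaves(node):
    """Concatenate the leaves of a (sub-)tree from left to right."""
    if isinstance(node, str):
        return node
    _, children = node
    return "".join(leaves(child) for child in children)


def phrases(tree):
    """Return (position, phrase, is_simple) for every sub-tree of the tree.

    `position` is the offset of the phrase inside the sentential form.
    A phrase is simple when its root has only leaves as children,
    that is, when it is derived from the root in one step.
    """
    found = []

    def walk(node, offset):
        if isinstance(node, str):
            return offset + len(node)
        _, children = node
        begin = offset
        for child in children:
            offset = walk(child, offset)
        is_simple = all(isinstance(child, str) for child in children)
        found.append((begin, leaves(node), is_simple))
        return offset

    walk(tree, 0)
    return found


def handle(tree):
    """The handle is the leftmost simple phrase of the sentential form."""
    simple = [(pos, phrase) for pos, phrase, is_simple in phrases(tree) if is_simple]
    return min(simple)[1]


# Assumed reading of the figure (see the note above).
tree = ("S", ["a", ("B", ["a",
                          ("B", ["b", ("S", ["b", ("A", ["a"])])]),
                          ("B", ["b"])])])

print(leaves(tree))
print([phrase for _, phrase, _ in phrases(tree)])
print(handle(tree))
```

For this assumed tree the sentence is `aabbab`, its simple phrases are the `a` at offset 4 and the final `b`, and the handle is that `a`.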
2. Ambiguity of a grammar
a. Ambiguity of a sentence
If a sentence of a grammar has two or more distinct syntax trees, the sentence is ambiguous.
b. Ambiguity of a grammar
If the language of a grammar contains an ambiguous sentence, the grammar is ambiguous.
Example: $G = (\{E\}, \{+, *, (, ), i\}, P, E)$, where $P$: $E \to E+E \mid E*E \mid (E) \mid i$
For the sentence $(i*i+i)$ there are two leftmost derivations, and therefore two syntax trees.
(1) $E \Rightarrow (E) \Rightarrow (E+E) \Rightarrow (E*E+E) \Rightarrow (i*E+E) \Rightarrow (i*i+E) \Rightarrow (i*i+i)$
(2) $E \Rightarrow (E) \Rightarrow (E*E) \Rightarrow (i*E) \Rightarrow (i*E+E) \Rightarrow (i*i+E) \Rightarrow (i*i+i)$
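Syntax tree for derivation (1):

        E
      / | \
     (  E  )
      / | \
     E  +  E
   / | \   |
  E  *  E  i
  |     |
  i     i

Syntax tree for derivation (2):

        E
      / | \
     (  E  )
      / | \
     E  *  E
     |   / | \
     i  E  +  E
        |     |
        i     i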
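The number of syntax trees of a sentence can be counted by trying every way of splitting it according to the productions. The sketch below does this for the grammar $E \to E+E \mid E*E \mid (E) \mid i$ only; the function name is hypothetical and the approach (memoized counting over substrings) is one of several possible. Since each syntax tree corresponds to exactly one leftmost derivation, a count above 1 means the sentence is ambiguous.

```python
from functools import lru_cache


def count_trees(sentence):
    """Count the syntax trees of `sentence` for E -> E+E | E*E | (E) | i."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of syntax trees that derive sentence[i:j] from E.
        if j - i < 1:
            return 0
        if j - i == 1:
            return 1 if sentence[i] == "i" else 0
        total = 0
        # E -> (E)
        if sentence[i] == "(" and sentence[j - 1] == ")":
            total += count(i + 1, j - 1)
        # E -> E+E and E -> E*E: try every operator as the top-level split.
        for k in range(i + 1, j - 1):
            if sentence[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(sentence))


print(count_trees("(i*i+i)"))
print(count_trees("i+i"))
```

The first call reports two trees for $(i*i+i)$, while $i+i$ has only one.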
Notes:
(1) Ambiguity makes syntax analysis nondeterministic.
(2) Ambiguity of a grammar is undecidable; that is, there is no algorithm that can determine in a finite number of steps whether an arbitrary grammar is ambiguous.
(3) To prove that a grammar is ambiguous, it is enough to give one sentence that has two syntax trees.
(4) If the ambiguity of a grammar can be controlled by additional conditions, such as operator precedence and associativity rules, its existence is not so harmful.