Simplification of CFG and Normal Forms
Wen-Guey Tzeng
Computer Science Department
National Chiao Tung University
Normal Forms
• We want a CFG in either Chomsky or Greibach normal form
– Chomsky normal form
• A → a, A → BC
– Greibach normal form
• A → ax, x ∈ V*
2016 Spring
• CFGs in normal form are easier to parse
– The membership problem
– Given a grammar G and a string w, find the parse tree for w if one exists.
– Example: w = x+y*z
• λ-free languages
– A language that does not contain λ
• We consider CFGs G such that L(G) is λ-free
• For any CFG G, there is a G’ such that L(G’) = L(G) - {λ}
Transformation to normal forms: steps
• Input: CFG G = (V, T, P, S) (λ-free context-free language)
• Remove (1) λ-productions, (2) unit-productions, (3) useless productions from P to get G’
• Convert G’ to normal forms
A substitution rule
• For A ≠ B, the rules A → x1Bx2, B → y1|y2|…|yn
are equivalent to A → x1y1x2|x1y2x2|…|x1ynx2, B → y1|y2|…|yn
• Example
– A → a|aaA|abBc, B → abbA|b is equivalent to A → a|aaA|ababbAc|abbc, B → abbA|b
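The substitution rule can be sketched in Python as follows. This is only an illustration under assumed conventions that are not part of the slides: a grammar is a dict from a variable to a list of right-hand sides, each right-hand side is a string of one-character symbols, and the helper name `substitute` is hypothetical. The sketch replaces every occurrence of B in the productions of A, which amounts to applying the rule once per occurrence.

```python
def substitute(grammar, a, b):
    """Replace variable b in every right-hand side of a by each
    right-hand side of b. Assumes a != b, as the rule requires."""
    assert a != b
    new_rhs = []
    for rhs in grammar[a]:
        # Build all expansions of this right-hand side, symbol by symbol.
        expansions = [""]
        for sym in rhs:
            if sym == b:
                expansions = [e + y for e in expansions for y in grammar[b]]
            else:
                expansions = [e + sym for e in expansions]
        for e in expansions:
            if e not in new_rhs:
                new_rhs.append(e)
    result = dict(grammar)
    result[a] = new_rhs  # the productions of b are kept unchanged
    return result


# The example from the slide: A -> a|aaA|abBc, B -> abbA|b
g = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
g2 = substitute(g, "A", "B")
print(g2["A"])  # ['a', 'aaA', 'ababbAc', 'abbc']
```

The printed productions for A match the right-hand side of the equivalence in the example above.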
Remove λ-productions
• λ-production: A → λ
• Nullable variable A: A ⇒* λ
• Steps
1. Find the set VN of nullable variables
2. For each A → x1x2…xm, xi ∈ V ∪ T, keep the production itself and
• For each combination xi, xj, …, xk of variables in VN
add A → x1 … xi-1 xi+1 … xj-1 xj+1 … xk-1 xk+1 … xm
• Note: don’t add A → λ, even if all xi are in VN
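A minimal Python sketch of these two steps, assuming (for illustration only) that a grammar is a dict from a variable to a list of right-hand-side strings, that each symbol is one character, and that the empty string "" stands for λ. The function names and the small example grammar are hypothetical and not taken from the slides.

```python
from itertools import combinations


def nullable_variables(grammar):
    """Step 1: compute VN, the variables A with A =>* lambda."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in nullable:
                continue
            # An empty body (lambda) or a body of nullable variables only.
            if any(all(s in nullable for s in rhs) for rhs in bodies):
                nullable.add(var)
                changed = True
    return nullable


def remove_lambda_productions(grammar):
    """Step 2: for each body, add every variant obtained by deleting a
    combination of nullable symbols, but never add the empty body."""
    nullable = nullable_variables(grammar)
    result = {}
    for var, bodies in grammar.items():
        new_bodies = []
        for rhs in bodies:
            positions = [i for i, s in enumerate(rhs) if s in nullable]
            for r in range(len(positions) + 1):
                for drop in combinations(positions, r):
                    candidate = "".join(
                        s for i, s in enumerate(rhs) if i not in drop
                    )
                    if candidate and candidate not in new_bodies:
                        new_bodies.append(candidate)
        result[var] = new_bodies
    return result


# Hypothetical example: S -> aAb, A -> aAb | lambda
example = {"S": ["aAb"], "A": ["aAb", ""]}
print(nullable_variables(example))         # {'A'}
print(remove_lambda_productions(example))  # {'S': ['aAb', 'ab'], 'A': ['aAb', 'ab']}
```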
Remove unit-productions
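• unit-production: A → B
• Steps
– Remove A → A immediately
– Draw a dependency graph on the variables, with an edge from A to B for each unit-production A → B; then A ⇒* B whenever B is reachable from A
– For A ⇒* B and the non-unit productions B → y1|y2|…|yn
• Add A → y1|y2|…|yn
– Remove all unit-productions A → B, where A and B are in the dependency graph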
Example
• S → Aa|B, B → A|bb, A → a|bc|B
• Draw dependency graph
1. Remove unit productions: S → Aa, B → bb, A → a|bc
2. Add: S → bb|a|bc, A → bb, B → a|bc
3. Finally: S → a|bc|bb|Aa, A → a|bc|bb, B → a|bc|bb
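The procedure and the example above can be checked with a short Python sketch. The representation is an assumption made for illustration: a dict from variable to a list of right-hand-side strings with one-character symbols, in a grammar that is already λ-free. The name `remove_unit_productions` is hypothetical.

```python
def remove_unit_productions(grammar):
    """For every variable A, collect the non-unit productions of all
    variables B with A =>* B through unit productions only."""
    variables = set(grammar)

    def is_unit(rhs):
        return len(rhs) == 1 and rhs in variables

    result = {}
    for a in grammar:
        # Walk the dependency graph of unit productions starting at a.
        seen, stack = {a}, [a]
        while stack:
            v = stack.pop()
            for rhs in grammar[v]:
                if is_unit(rhs) and rhs not in seen:
                    seen.add(rhs)
                    stack.append(rhs)
        # Keep a's own non-unit productions, then add those it can reach.
        bodies = []
        for v in [a] + sorted(seen - {a}):
            for rhs in grammar[v]:
                if not is_unit(rhs) and rhs not in bodies:
                    bodies.append(rhs)
        result[a] = bodies
    return result


# The example from the slide: S -> Aa|B, B -> A|bb, A -> a|bc|B
g = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
for var, bodies in remove_unit_productions(g).items():
    print(var, "->", "|".join(bodies))
```

Up to the order of alternatives, the output agrees with step 3 of the example.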
Remove useless productions
• A variable A ∈ V is useful if S can generate some terminal string through it.
– That is, S ⇒* xAy ⇒* w, w ∈ T*
• Example
– S → aSb|AB|Ba, A → aA, B → b|Bb, C → cB|c
– S ⇒ Ba ⇒ ba. Thus, B is useful.
– S is useful.
– But A and C are not useful (useless)
• Two cases for useless variables
– Case 1: variables that cannot generate strings in T*
• S → aSb|AB|Ba, A → aA, B → b|Bb, C → cB|c
• Algorithm (finding variables that generate strings)
1. V1 = {}
2. For each rule A → x with x ∈ (T ∪ V1)*, add A to V1
3. Repeat 2 until no more variables can be added to V1
• V1 = {S, B, C}
• S → aSb|Ba, B → b|Bb, C → cB|c
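The Case 1 algorithm is a fixed-point iteration, sketched below in Python. The dict-of-strings grammar representation, the explicit `terminals` set, and the function names are assumptions made for this illustration.

```python
def generating_variables(grammar, terminals):
    """Compute V1: variables that can derive some string in T*."""
    v1 = set()
    changed = True
    while changed:  # repeat step 2 until V1 stops growing
        changed = False
        for var, bodies in grammar.items():
            if var in v1:
                continue
            if any(all(s in terminals or s in v1 for s in rhs) for rhs in bodies):
                v1.add(var)
                changed = True
    return v1


def drop_non_generating(grammar, terminals):
    """Keep only productions whose symbols are terminals or in V1."""
    v1 = generating_variables(grammar, terminals)
    return {
        var: [rhs for rhs in bodies
              if all(s in terminals or s in v1 for s in rhs)]
        for var, bodies in grammar.items()
        if var in v1
    }


# The example from the slide.
g = {"S": ["aSb", "AB", "Ba"], "A": ["aA"], "B": ["b", "Bb"], "C": ["cB", "c"]}
T = set("abc")
print(generating_variables(g, T))  # S, B and C (set order may vary)
print(drop_non_generating(g, T))
```

On the slide's grammar this drops A and the production S → AB, leaving S → aSb|Ba, B → b|Bb, C → cB|c.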
– Case 2: variables that cannot be reached from S
• S → aSb|Ba, B → b|Bb, C → cB|c
• Algorithm: dependency graph
• C is unreachable from S.
• S → aSb|Ba, B → b|Bb
[Figure: dependency graph on the variables S, B, C]
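Case 2 amounts to a graph search from S over the dependency graph. A minimal Python sketch, assuming a dict-of-strings grammar in which every variable that occurs has an entry; the function names are hypothetical.

```python
def reachable_variables(grammar, start):
    """Variables reachable from start in the dependency graph, where
    A has an edge to B if B occurs in some right-hand side of A."""
    reached, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for rhs in grammar.get(v, []):
            for s in rhs:
                if s in grammar and s not in reached:
                    reached.add(s)
                    stack.append(s)
    return reached


def drop_unreachable(grammar, start):
    """Remove all productions of variables that S cannot reach."""
    reached = reachable_variables(grammar, start)
    return {var: bodies for var, bodies in grammar.items() if var in reached}


# The grammar left after Case 1 on the slide.
g = {"S": ["aSb", "Ba"], "B": ["b", "Bb"], "C": ["cB", "c"]}
print(reachable_variables(g, "S"))  # S and B (set order may vary)
print(drop_unreachable(g, "S"))     # {'S': ['aSb', 'Ba'], 'B': ['b', 'Bb']}
```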
• Algorithm (removing useless productions)
Input: G=(V, T, P, S)
1. Find the useless variables of Case 1 and remove the productions that contain them.
2. Find the useless (unreachable) variables of Case 2 and remove the productions that contain them.
Chomsky normal form
• A CFG is in Chomsky normal form (CNF) if all productions are of the form
A → BC, or A → a
• Example
– S → AS|a, A → SA|b
• Every CFG G with λ ∉ L(G) has an equivalent CNF grammar.
Converting into CNF
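1. Apply the rules for removing λ-, unit-, and useless productions
2. Convert the productions into the form A → C1C2…Cn (all Ci variables) or A → a, by replacing each terminal a in a longer right-hand side with a new variable Ba and adding Ba → a
3. Convert each A → C1C2…Cn with n > 2 into A → C1D1, D1 → C2D2, …, Dn-2 → Cn-1Cn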
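Steps 2 and 3 can be sketched in Python as below. Everything specific here is an assumption for illustration: right-hand sides are tuples of symbol names so that fresh variables can have longer names, the fresh names `T_a` and `D1, D2, …` are hypothetical and assumed not to clash with existing variables, and the input is assumed to have no λ- or unit-productions already. The example grammar is not from the slides.

```python
def to_cnf(grammar, terminals):
    """grammar: dict mapping a variable to a list of tuples of symbols."""
    result = {}
    term_var = {}  # terminal -> fresh variable, e.g. 'a' -> 'T_a'
    counter = 0    # numbering for the fresh variables D1, D2, ...

    def add(var, rhs):
        bodies = result.setdefault(var, [])
        if rhs not in bodies:
            bodies.append(rhs)

    for var, bodies in grammar.items():
        for rhs in bodies:
            if len(rhs) == 1:
                add(var, rhs)  # already of the form A -> a
                continue
            # Step 2: replace terminals in long bodies by fresh variables.
            symbols = []
            for s in rhs:
                if s in terminals:
                    term_var.setdefault(s, "T_" + s)
                    symbols.append(term_var[s])
                else:
                    symbols.append(s)
            # Step 3: split A -> C1 C2 ... Cn into a chain of binary rules.
            head = var
            while len(symbols) > 2:
                counter += 1
                d = "D" + str(counter)
                add(head, (symbols[0], d))
                head, symbols = d, symbols[1:]
            add(head, tuple(symbols))

    for t, v in term_var.items():
        add(v, (t,))
    return result


# Hypothetical example: S -> ABa, A -> aab, B -> Ac
example = {"S": [("A", "B", "a")], "A": [("a", "a", "b")], "B": [("A", "c")]}
terminals = {"a", "b", "c"}
cnf = to_cnf(example, terminals)
for var, bodies in cnf.items():
    print(var, "->", " | ".join(" ".join(rhs) for rhs in bodies))
```

Every production in the result has either two variables or a single terminal on the right-hand side.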
Greibach normal form
• A CFG is in Greibach normal form (GNF) if all productions are of the form
A → aB1B2…Bn, n ≥ 0
• Example
– S → aBC, B → aBA, A → a|bBSC
• Every CFG G with λ ∉ L(G) has an equivalent GNF grammar.
Example
• Example
– S → AB, A → aA|bB|b, B → b
– Result
• S → aAB|bBB|bB, A → aA|bB|b, B → b
• Example
– S → abSb|aa
– Result
• S → aBSB|aA, B → b, A → a
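A small Python check of the GNF shape, applied to the two examples above. It only tests the form of the productions and does not perform the conversion. The dict-of-strings representation and the name `is_gnf` are assumptions for illustration, and every variable is assumed to appear as a key.

```python
def is_gnf(grammar):
    """True if every production is a terminal followed by zero or more
    variables, i.e. of the form A -> a B1 B2 ... Bn with n >= 0."""
    variables = set(grammar)
    for bodies in grammar.values():
        for rhs in bodies:
            if not rhs or rhs[0] in variables:
                return False  # empty body, or body starts with a variable
            if any(s not in variables for s in rhs[1:]):
                return False  # a terminal after the first position
    return True


before_1 = {"S": ["AB"], "A": ["aA", "bB", "b"], "B": ["b"]}
after_1 = {"S": ["aAB", "bBB", "bB"], "A": ["aA", "bB", "b"], "B": ["b"]}
before_2 = {"S": ["abSb", "aa"]}
after_2 = {"S": ["aBSB", "aA"], "B": ["b"], "A": ["a"]}

print(is_gnf(before_1), is_gnf(after_1))  # False True
print(is_gnf(before_2), is_gnf(after_2))  # False True
```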
Parsing (membership)
• Question: Given a CFG G in Chomsky normal form and a string w, determine whether w ∈ L(G)
• Idea: the dynamic programming technique
– A large problem is decomposed into smaller problems
– Combine solutions to smaller problems into a solution for the large problem
• Assume w = a1a2…an
• Use the dynamic programming technique
– Vij = {A ∈ V : A ⇒* aiai+1…aj}: variables that can generate the substring aiai+1…aj
• Solve smaller problems Vik, Vk+1,j, for k = i, i+1, …, j-1
• Combine them to compute Vij
– Vij = {A : A → BC, B ∈ Vik, C ∈ Vk+1,j, i ≤ k < j}
[Figure: w = a1 a2 a3 … ai ai+1 … aj-1 aj … an. Vij contains the variables that generate ai ai+1 … aj-1 aj. The substring is split as ai … ak | ak+1 … aj, giving the pairs (Vi,k, Vk+1,j), (Vi,k+1, Vk+2,j), …]
• Triangular table (n=5)
V1,5
V1,4 V2,5
V1,3 V2,4 V3,5
V1,2 V2,3 V3,4 V4,5
V1,1 V2,2 V3,3 V4,4 V5,5
CYK Algorithm
• Input: G = (V, T, P, S) in CNF and w = a1a2…an
– Compute Vij = {A ∈ V : A ⇒* aiai+1…aj} in the order
• V11, V22, …, Vnn
• V12, V23, …, Vn-1,n
• …
• V1n
1. Smallest problem: add A to Vii
• if A → ai is a production in P
2. Bigger problem: add A to Vij if
• for some k, i ≤ k ≤ j-1, A → BC is in P, B is in Vik, and C is in Vk+1,j
3. w ∈ L(G) if and only if S ∈ V1n
Example
• S → AB, A → BB|a, B → AB|b
• w = aabbb
• Steps
– V11={A}, V22={A}, V33={B}, V44={B}, V55={B}
– V12=∅, V23={S, B}, V34={A}, V45={A}
– V13={S, B}, V24={A}, V35={S, B}
– V14={A}, V25={S, B}
– V15={S, B}
– S ∈ V15, so w ∈ L(G)
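The CYK steps can be sketched in Python and run on the example above. The representation is an assumption for illustration: a CNF grammar is a dict from variable to a list of right-hand sides, each either a one-character terminal or a two-character string of variables. The names `cyk_table` and `cyk_accepts` are hypothetical. The table uses the slides' 1-based, inclusive indices (i, j).

```python
def cyk_table(grammar, w):
    """Return a dict V with V[(i, j)] = set of variables deriving w[i..j]."""
    n = len(w)
    V = {}
    # Smallest problems: A is in V[i, i] if A -> a_i is a production.
    for i in range(1, n + 1):
        V[(i, i)] = {a for a, bodies in grammar.items() if w[i - 1] in bodies}
    # Bigger problems, by increasing substring length.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):  # split point, i <= k <= j-1
                for a, bodies in grammar.items():
                    for rhs in bodies:
                        if (len(rhs) == 2
                                and rhs[0] in V[(i, k)]
                                and rhs[1] in V[(k + 1, j)]):
                            cell.add(a)
            V[(i, j)] = cell
    return V


def cyk_accepts(grammar, start, w):
    """w is in L(G) iff the start variable is in V[1, n]."""
    if not w:
        return False  # a CNF grammar cannot derive the empty string
    return start in cyk_table(grammar, w)[(1, len(w))]


# The example from the slide.
g = {"S": ["AB"], "A": ["BB", "a"], "B": ["AB", "b"]}
table = cyk_table(g, "aabbb")
print(table[(1, 2)], table[(2, 3)], table[(1, 5)])
print(cyk_accepts(g, "S", "aabbb"))  # True
```

The three nested loops over length, start position, and split point give O(n^3) table updates, each scanning the productions once.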
Sum up
• Context-free grammars are used in designing programming languages, such as C, Pascal, etc.
• The membership problem for CFGs corresponds to the parsing problem in programming languages
• Normal forms are needed for “automatically” generating a “parser” for the programming language