(211041612) 9a

21
Chomsky Normal Form We introduce Chomsky Normal Form, which is used to answer questions about context-free languages

Upload: dasariorama

Post on 28-May-2017

237 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: (211041612) 9a

Chomsky Normal Form

We introduce Chomsky Normal Form, which is used to answer questions about context-free

languages

Page 2: (211041612) 9a

Goddard 9a:

Chomsky Normal Form

Chomsky Normal Form. A grammar whereevery production is either of the form A → BC or A → c (where A, B, C are arbitrary variables and c an arbitrary symbol).

Example:

S → AS | aA → SA | b

(If language contains ε, then we allow S → εwhere S is start symbol, and forbid S on RHS.)

Page 3: (211041612) 9a

Goddard 9a:

Why Chomsky Normal Form?

The key advantage is that in Chomsky NormalForm, every derivation of a string of n letters has exactly 2n − 1 steps.

Thus: one can determine if a string is in the lan- guage by exhaustive search of all derivations.

Page 4: (211041612) 9a

Goddard 9a:

Conversion

The conversion to Chomsky Normal Form hasfour main steps:

1. Get rid of all ε productions.

2. Get rid of all productions where RHS is one variable.

3. Replace every production that is too long by shorter productions.

4. Move all terminals to productions where RHSis one terminal.

Page 5: (211041612) 9a

Goddard 9a:

1) Eliminate ε Productions

Determine the nullable variables (those that gen-erate ε) (algorithm given earlier).

Go through all productions, and for each, omit every possible subset of nullable variables.

For example, if P → AxB with both A and Bnullable, add productions P → xB | Ax | x.

After this, delete all productions with empty RHS.

Page 6: (211041612) 9a

Goddard 9a:

2) Eliminate Variable Unit Productions

A unit production is where RHS has only onesymbol.

Consider production A → B. Then for every pro- duction B → α, add the production A → α. Re- peat until done (but don’t re-create a unit pro- duction already deleted).

Page 7: (211041612) 9a

Goddard 9a:

3) Replace Long Productions by Shorter Ones

For example, if have production A → BC D, then replace it with A → BE and E → C D.

(In theory this introduces many new variables, but one can re-use variables if careful.)

Page 8: (211041612) 9a

Goddard 9a:

4) Move Terminals to Unit Productions

For every terminal on the right of a non-unitproduction, add a substitute variable.

For example, replace production A → bC with productions A → BC and B → b.

Page 9: (211041612) 9a

Goddard 9a:

Example

Consider the CFG:S → aX bXX → aY | bY | εY → X | c

The variable X is nullable; and so therefore is Y . After elimination of ε, we obtain:

S → aX bX | abX | aX b | abX → aY | bY | a | bY → X | c

Page 10: (211041612) 9a

Goddard 9a:

Example: Step 2

After elimination of the unit production Y → X , we obtain:

S → aX bX | abX | aX b | abX → aY | bY | a | bY → aY | bY | a | b | c

Page 11: (211041612) 9a

Goddard 9a:

Example: Steps 3 & 4

Now, break up the RHSs of S; and replace a by A,b by B and c by C wherever not units:

S → EF | AF | EB | AB X → AY | BY | a | bY → AY | BY | a | b | cE → AX F → BX A → aB → b

Page 12: (211041612) 9a

Goddard 9a:

C → c

Page 13: (211041612) 9a

Goddard 9a:

Practice

Convert the following CFG into Chomsky Nor -mal Form:

S → AbAA → Aa | ε

Page 14: (211041612) 9a

Goddard 9a:

Solution to Practice

After the first step, one has:

S → AbA | bA | Ab | bA → Aa | a

The second step does not apply. After the third step, one has:

S → T A | bA | Ab | bA → Aa | aT → Ab

Page 15: (211041612) 9a

Goddard 9a:

Solution Continued

And finally, one has:

S → T A | BA | AB | bA → AC | aT → AB B → bC → a

Page 16: (211041612) 9a

Goddard 9a:

Summary

There are special forms for CFGs such as Chom-sky Normal Form, where every production has the form A → BC or A → c. The algorithm to convert to this form involves (1) determin- ing all nullable variables and getting rid of all ε-productions, (2) getting rid of all variable unit productions, (3) breaking up long productions, and (4) moving terminals to unit productions.