
Upload: vivian-mckenzie

Post on 18-Jan-2016


25.11.2003 csa3050: Parsing Algorithms 1 1

CSA350: NLP Algorithms

Parsing Algorithms 1

• Top Down

• Bottom-Up

• Left Corner


References

• This lecture is based on material found in Jurafsky & Martin chapter 10.

• Relevant material available from Vince.


Simple Grammar

[Figure: grammar rules and lexicon; not recoverable from the transcript]


Parsing Problem

• Find all trees such that:
  – root is S
  – leaves exactly cover all the input words, e.g.

[Figure: example parse tree; not recoverable from the transcript]


Parsing as Search

• Search within a space defined by
  – Start State
  – Goal State
  – State to state transformations

• Shape of space depends on parsing strategy

• Two distinct strategies for finding the parse trees:
  – Top down
  – Bottom up


Top Down

• A top down parser tries to build the tree from the root node S down to the leaves, replacing each node that has a non-terminal label with the RHS of a corresponding grammar rule.

• Nodes with pre-terminal (word class) labels are compared to input words.


Top Down Search Space

[Figure: top down search space, with the start node and the goal node marked]


Bottom Up

• Each state is a forest of trees.

• Start node is a forest of nodes labelled with pre-terminal categories (word classes derived from the lexicon).

• Transformations look for places where RHS of rules can fit.

• Any such place is replaced with a node labelled with LHS of rule.
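
The bottom up transformation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grammar and lexicon are a toy fragment assumed for the example (chosen to be consistent with the S rules and left corner table quoted later in these slides), each forest is reduced to the tuple of its root labels, and `bottom_up_recognize` is a hypothetical name. It recognises a sentence rather than building the trees.

```python
from itertools import product

# Assumed toy grammar as (LHS, RHS) pairs; not taken verbatim from the slides.
RULES = [
    ("S", ("NP", "VP")), ("S", ("Aux", "NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "Nominal")), ("NP", ("Proper-Noun",)),
    ("Nominal", ("Noun",)), ("Nominal", ("Noun", "Nominal")),
    ("VP", ("Verb",)), ("VP", ("Verb", "NP")),
]
# Assumed toy lexicon: word -> set of word classes (pre-terminals).
LEXICON = {
    "book": {"Noun", "Verb"}, "that": {"Det"}, "flight": {"Noun"},
    "does": {"Aux"}, "Houston": {"Proper-Noun"}, "prefer": {"Verb"},
}

def bottom_up_recognize(words, rules=RULES, lexicon=LEXICON):
    """Return True if some sequence of reductions turns the input into S."""
    # Start states: one forest of pre-terminal labels per choice of word classes.
    options = [sorted(lexicon.get(w, ())) for w in words]
    agenda = [tuple(tags) for tags in product(*options)]
    seen = set(agenda)
    while agenda:
        forest = agenda.pop()
        if forest == ("S",):
            return True  # a single tree rooted in S covers the whole input
        for lhs, rhs in rules:
            n = len(rhs)
            # Look for every place where the RHS of the rule fits.
            for i in range(len(forest) - n + 1):
                if forest[i:i + n] == rhs:
                    # Replace that place with a node labelled with the LHS.
                    new = forest[:i] + (lhs,) + forest[i + n:]
                    if new not in seen:
                        seen.add(new)
                        agenda.append(new)
    return False

print(bottom_up_recognize("book that flight".split()))
print(bottom_up_recognize("that book".split()))
```

Note that the search happily builds forests such as (Det, S) for "that book", which can never be completed: subtrees are constructed with no check that they can lead to an S node.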


Bottom Up Search Space

[Figure: bottom up search space; not recoverable from the transcript]


Top Down vs Bottom Up

• Top down
  – For: Never wastes time exploring trees that cannot be derived from S.
  – Against: Can generate trees that are not consistent with the input.

• Bottom up
  – For: Never wastes time building trees that cannot lead to input text segments.
  – Against: Can generate subtrees that can never lead to an S node.


Development of a Concrete Strategy

• Combine best features of both top down and bottom up strategies.
  – Top down, grammar directed control.
  – Bottom up filtering.

• Examination of alternatives in parallel uses too much memory.

• Depth first strategy using agenda-based control.
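
A depth first, agenda-based top down strategy might look like the following Python sketch. It is not the algorithm from the slides (the original figure is missing from this transcript): the grammar and lexicon are an assumed toy fragment, `top_down_recognize` is a hypothetical name, and the code only recognises a sentence instead of returning parse trees. It also assumes the grammar has no left recursion, otherwise the loop would not terminate.

```python
# Assumed toy grammar: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}
# Assumed toy lexicon: word -> set of word classes (pre-terminals).
LEXICON = {
    "book": {"Noun", "Verb"}, "that": {"Det"}, "flight": {"Noun"},
    "does": {"Aux"}, "Houston": {"Proper-Noun"}, "prefer": {"Verb"},
}

def top_down_recognize(words, grammar=GRAMMAR, lexicon=LEXICON):
    """Return True if the word list can be derived from S."""
    # A search state is (symbols still to be derived, position in the input).
    agenda = [(("S",), 0)]
    while agenda:
        symbols, pos = agenda.pop()  # popping the newest state gives depth first search
        if not symbols:
            if pos == len(words):
                return True  # nothing left to derive and all input consumed
            continue
        first, rest = symbols[0], symbols[1:]
        if first in grammar:
            # Non-terminal: replace it with the RHS of each of its rules.
            # Pushed in reverse so that the first rule is explored first.
            for rhs in reversed(grammar[first]):
                agenda.append((tuple(rhs) + rest, pos))
        elif pos < len(words) and first in lexicon.get(words[pos], ()):
            # Pre-terminal: only here is the input actually consulted.
            agenda.append((rest, pos + 1))
    return False

print(top_down_recognize("book that flight".split()))
print(top_down_recognize("that book".split()))
```

Keeping only one agenda of pending states, rather than expanding all alternatives in parallel, is what keeps memory use low; the cost is that failed alternatives have to be discovered one at a time.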


Top Down Algorithm


Derivation: top down, left-to-right, depth first


A Problem with the Algorithm

• Note that the first three steps of the parse involve a failed attempt to expand the first rule, S → NP VP.

• The parser recursively expands the leftmost NT of this rule (NP).

• While all this work is going on, the input is not even consulted!

• Only when a terminal symbol is encountered is the input compared and the failure discovered.

• This is pretty inefficient.


Bottom Up Filtering

• We know the current input word must serve as the first word in the derivation of the unexpanded node the parser is currently processing.

• Therefore the parser should not consider any grammar rule for which the current word cannot serve as the "left corner".

• The left corner is the first word (or preterminal node) along the left edge of a derivation.


Left Corner

The nodes Verb and prefer are each left corners of VP

[Figure: parse tree illustrating the left corners of VP; not recoverable from the transcript]


Left Corner

• B is a left corner of A iff A ⇒* Bα, for non-terminal A, pre-terminal B and symbol string α.

• Possible left corners of all non-terminal categories can be determined in advance and placed in a table.
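
One way to precompute such a table is a simple fixed-point iteration over the rules. The Python sketch below assumes a toy grammar (not given in this transcript, but consistent with the S rules and the table in these slides), assumes there are no rules with an empty RHS, and uses the hypothetical name `left_corner_table`.

```python
# Assumed toy grammar: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}

def left_corner_table(grammar):
    """Map each non-terminal A to the pre-terminals B such that A =>* B alpha."""
    table = {category: set() for category in grammar}
    changed = True
    while changed:  # repeat until no rule adds anything new
        changed = False
        for lhs, expansions in grammar.items():
            for rhs in expansions:
                first = rhs[0]
                # A non-terminal first symbol passes on its own left corners;
                # a pre-terminal first symbol is a left corner itself.
                new = table[first] if first in grammar else {first}
                if not new <= table[lhs]:
                    table[lhs] |= new
                    changed = True
    return table

for category, corners in left_corner_table(GRAMMAR).items():
    print(category, sorted(corners))
```

Because the table depends only on the grammar, it is computed once and then reused for every sentence.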


Example of Left Corner Table

Category    Left Corners
S           Det, Proper-Noun, Aux, Verb
NP          Det, Proper-Noun
Nominal     Noun
VP          Verb


How to use the Left Corner Table

• If attempting to parse category A, only consider rules A → Bα for which category(current input) ∈ LeftCorners(B)

• S → NP VP
  S → Aux NP VP
  S → VP
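
As a rough illustration, the filter can be written as a lookup against the table. The Python sketch below hard-codes the left corner table from the previous slide; the lexicon, the treatment of a pre-terminal as its own left corner (needed so that a rule such as S → Aux NP VP can pass the test), and the name `viable_rules` are assumptions made for the example.

```python
# Left corner table as given on the previous slide.
LEFT_CORNERS = {
    "S": {"Det", "Proper-Noun", "Aux", "Verb"},
    "NP": {"Det", "Proper-Noun"},
    "Nominal": {"Noun"},
    "VP": {"Verb"},
}
# The three S rules from this slide.
GRAMMAR = {"S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]]}
# Assumed toy lexicon: word -> set of word classes.
LEXICON = {"book": {"Noun", "Verb"}, "that": {"Det"}, "does": {"Aux"}}

def left_corners(symbol):
    # Assumption: a pre-terminal such as Aux counts as its own left corner.
    return LEFT_CORNERS.get(symbol, {symbol})

def viable_rules(category, word, grammar=GRAMMAR, lexicon=LEXICON):
    """Return the RHSs of `category` whose first symbol admits `word` as left corner."""
    word_classes = lexicon.get(word, set())
    return [rhs for rhs in grammar[category]
            if left_corners(rhs[0]) & word_classes]

print(viable_rules("S", "does"))   # [['Aux', 'NP', 'VP']]
print(viable_rules("S", "that"))   # [['NP', 'VP']]
print(viable_rules("S", "book"))   # [['VP']]
```

With the filter in place, an input starting with "does" never triggers the doomed expansion of S → NP VP described earlier.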


Next Week

• Problems with top down parser
  – left recursion
  – repeated work

• Earley Algorithm

• Assignment

• See J & M ch 10