lecture 10 parsing ii - university of california, san...
TRANSCRIPT
Probabilistic CKY
Roger Levy
[thanks to Jason Eisner]
Managing Ambiguity
• John saw Mary • Typhoid Mary • Phillips screwdriver Mary note how rare rules interact
• I see a bird • is this 4 nouns – parsed like “city park scavenger bird”? rare parts of speech, plus systematic ambiguity in noun sequences
• Time flies like an arrow • Fruit flies like a banana • Time reactions like this one • Time reactions like a chemist • or is it just an NP?
Our bane: Ambiguity
• John saw Mary • Typhoid Mary • Phillips screwdriver Mary note how rare rules interact
• I see a bird • is this 4 nouns – parsed like “city park scavenger bird”? rare parts of speech, plus systematic ambiguity in noun sequences
• Time | flies like an arrow NP VP • Fruit flies | like a banana NP VP • Time | reactions like this one V[stem] NP • Time reactions | like a chemist S PP • or is it just an NP?
How to solve this combinatorial explosion of ambiguity?
1. First try parsing without any weird rules, throwing them in only if needed.
2. Better: every rule has a weight. A tree’s weight is total weight of all its rules. Pick the overall lightest parse of sentence.
3. Can we pick the weights automatically? We’ll get to this later …
The plan for the rest of parsing
• There’s probability (inference) and then there’s statistics (model estimation)
• These two problems are logically separate, yet interrelated in practice
• We’ll start with the problem of efficient inference (probabilistic bottom-up parsing)
• Then we’ll move to the estimation of good (generative) models that permit efficient bottom-up parsing
• Finally, we’ll mention state-of-the-art work that doesn’t permit exact inference (“reranking” approaches)
The critical recursion
• We’ll do bottom-up parsing (weighted CKY) • When combining constituents into a larger constituent:
• The weight of the new constituent is the sum of the weights of the combined subconstituents…
• …plus the weight of the rule used to combine the subconstituents
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
2 P 2 V 5
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
2 P 2 V 5
PP 12
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
NP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
NP 18 S 21
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
S Follow backpointers …
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
S
NP VP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
S
NP VP
VP PP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
S
NP VP
VP PP
P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
S
NP VP
VP PP
P NP
Det N
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Which entries do we need?
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Which entries do we need?
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Not worth keeping …
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
… since it just breeds worse options
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Keep only best-in-class!
“inferior stock”
time 1 flies 2 like 3 an 4 arrow 5 NP 3 Vst 3
NP 10 S 8
NP 24 S 22
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Keep only best-in-class! (and backpointers so you can recover parse)
Probabilistic Trees
• Instead of lightest weight tree, take highest probability tree • Given any tree, your assignment 1 generator would have some
probability of producing it! • Just like using n-grams to choose among strings … • What is the probability of this tree?
S
NP time
VP
VP flies
PP
P like
NP
Det an
N arrow
Probabilistic Trees
• Instead of lightest weight tree, take highest probability tree
• Given any tree, your assignment 1 generator would have some probability of producing it!
• Just like using n-grams to choose among strings … • What is the probability of this tree?
• You rolled a lot of independent dice …
S
NP time
VP
VP flies
PP
P like
NP
Det an
N arrow
p( | S)
Chain rule: One word at a time
p(time flies like an arrow) = p(time) * p(flies | time) * p(like | time flies) * p(an | time flies like) * p(arrow | time flies like an)
Chain rule + backoff (to get trigram model)
p(time flies like an arrow) = p(time) * p(flies | time) * p(like | time flies) * p(an | time flies like) * p(arrow | time flies like an)
Chain rule – written differently
p(time flies like an arrow) = p(time) * p(time flies | time) * p(time flies like | time flies) * p(time flies like an | time flies like) * p(time flies like an arrow | time flies like an)
Proof: p(x,y | x) = p(x | x) * p(y | x, x) = 1 * p(y | x)
Chain rule + backoff
p(time flies like an arrow) = p(time) * p(time flies | time) * p(time flies like | time flies) * p(time flies like an | time flies like) * p(time flies like an arrow | time flies like an)
Proof: p(x,y | x) = p(x | x) * p(y | x, x) = 1 * p(y | x)
Chain rule: One node at a time
S
NP time
VP
VP flies
PP
P like
NP
Det an
N arrow
p( | S) = p( S
NP VP | S) * p(
S
NP time
VP |
S
NP VP )
* p( S
NP time
VP
VP PP
| S
NP time
VP )
* p( S
NP time
VP
VP flies
PP
| S
NP time
VP ) * …
VP PP
Chain rule + backoff
S
NP time
VP
VP flies
PP
P like
NP
Det an
N arrow
p( | S) = p( S
NP VP | S) * p(
S
NP time
VP |
S
NP VP )
* p( S
NP time
VP
VP PP
| S
NP time
VP )
* p( S
NP time
VP
VP flies
PP
| S
NP time
VP ) * …
VP PP
Simplified notation
S
NP time
VP
VP flies
PP
P like
NP
Det an
N arrow
p( | S) = p(S → NP VP | S) * p(NP → flies | NP)
* p(VP → VP NP | VP)
* p(VP → flies | VP) * …
Already have a CKY alg for weights … S
NP time
VP
VP flies
PP
P like
NP
Det an
N arrow
w( | S) = w(S → NP VP) + w(NP → flies | NP)
+ w(VP → VP NP)
+ w(VP → flies) + …
Just let w(X → Y Z) = -log p(X → Y Z | X) Then lightest tree has highest prob
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
multiply to get 2-22
2-8
2-12
2-2
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
multiply to get 2-22
2-8
2-12
2-2 2-13
Need only best-in-class to get best parse
Why probabilities not weights? • We just saw probabilities are really just a special case
of weights … • … but we can estimate them from training data by
counting and smoothing! • Warning: What kind of training data do we need for this
type of estimation?
A slightly different task
• One task: What is probability of generating a given tree with a PCFG generator? • To pick tree with highest prob: useful in parsing.
• But could also ask: What is probability of generating a given string with the generator? • To pick string with highest prob: useful in speech
recognition, as substitute for an n-gram model. • (“Put the file in the folder” vs. “Put the file and the
folder”) • To get prob of generating string, must add up
probabilities of all trees for the string …
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S 8 S 13
NP 24 S 22 S 27 NP 24 S 27 S 22 S 27
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Could just add up the parse probabilities
2-22 2-27
2-27 2-22 2-27
oops, back to finding exponentially many
parses
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S S 2-13
NP 24 S 22 S 27 NP 24 S 27 S S
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 2-12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2-2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Any more efficient way?
2-8
2-22
2-27
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S
NP 24 S 22 S 27 NP 24 S 27 S
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 2-12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2-2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Add as we go … (the “inside algorithm”)
2-8+2-13
2-22
+2-27
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3 Vst 3
NP 10 S
NP
S
1 NP 4 VP 4
NP 18 S 21 VP 18
2 P 2 V 5
PP 2-12 VP 16
3 Det 1 NP 10 4 N 8
1 S → NP VP 6 S → Vst NP 2-2 S → S PP
1 VP → V NP 2 VP → VP PP
1 NP → Det N 2 NP → NP PP 3 NP → NP NP
0 PP → P NP
Add as we go … (the “inside algorithm”)
2-8+2-13
+2-22
+2-27
2-22
+2-27 2-22
+2-27 +2-27
PCFGs and HMMs
• There is a natural connection between: • the inside algorithm for filling out a probabilistic CKY
chart • and the forward algorithm for filling out an HMM trellis
• Do you see the relationship?
• However, bottom-up for PCFGs and left-to-right for HMMs are not the only ways to go for DPs • You can go outside-in for PCFGs • We’ll see left-to-right (the Earley algorithm) for PCFGs • And you can go right-to-left for HMMs
Charts and lattices
• You can equivalently represent a parse chart as a lattice constructed over some initial arcs
• This will also set the stage for Earley parsing later salt flies scratch
NP N
NP S
S
VP V N NP
S VP NP
N NP V VP salt
N NP V VP
flies scratch
N NP V VP N NP
NP S NP VP S
S
S → NP VP VP→ V NP VP→V NP→ N NP→ N N
(Speech) Lattices
• There was nothing magical about words spanning exactly one position.
• When working with speech, we generally don’t know how many words there are, or where they break.
• We can represent the possibilities as a lattice and parse these just as easily.
I awe
of
van
eyes
saw a
‘ve
an
Ivan