Post on 04-Jun-2018
Lecture 15: Dependency Parsing
Kai-Wei Chang, CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16
CS6501: NLP
Dependency trees
- Dependency grammar describes the structure of a sentence as a graph (tree)
  - Nodes represent words
  - Edges represent dependencies
- The idea goes back to the 4th century BC in ancient India
Phrase structure (constituent parse) trees
- Can be modeled by context-free grammars
- We will see how constituent parses and dependency parses are related
Context-free grammars
PP → P NP
PP → P DT N
PP → in the garden
Non-terminal: DT, N, P, NP, PP, …
Terminal: the, a, ball, garden
Constituent Parse
S → NP VP
NP → DT NN
NP → DT
PP → IN NP
VP → VBD PP
VP → NP VBD NP
VP → NP VB
Probabilistic Context-free Grammar
1.0  S → NP VP
0.6  NP → DT NN
0.4  NP → NP PP
1.0  PP → IN NP
0.5  VP → VBD PP
0.2  VP → NP VBD NP
0.3  VP → NP VB
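As a minimal sketch of how these rule probabilities are used: under a PCFG, the probability of a parse tree is the product of the probabilities of the rules it applies. The tree below is a hypothetical example built only from the rules listed above; leaves are part-of-speech tags, so no lexical rules are needed.

```python
# Rule probabilities copied from the slide: (lhs, rhs) -> probability.
PCFG = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "NN")): 0.6,
    ("NP", ("NP", "PP")): 0.4,
    ("PP", ("IN", "NP")): 1.0,
    ("VP", ("VBD", "PP")): 0.5,
    ("VP", ("NP", "VBD", "NP")): 0.2,
    ("VP", ("NP", "VB")): 0.3,
}


def tree_prob(tree):
    """Probability of a tree: the product of the probabilities of its rules.

    An internal node is a tuple (label, child, ...); a leaf is a bare tag string.
    """
    if isinstance(tree, str):
        return 1.0  # a leaf applies no rule
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = PCFG[(label, rhs)]
    for child in children:
        p *= tree_prob(child)
    return p


# Hypothetical tree: S -> NP VP, NP -> DT NN, VP -> VBD PP, PP -> IN NP, NP -> DT NN
example_tree = (
    "S",
    ("NP", "DT", "NN"),
    ("VP", "VBD", ("PP", "IN", ("NP", "DT", "NN"))),
)
print(tree_prob(example_tree))  # 1.0 * 0.6 * 0.5 * 1.0 * 0.6, about 0.18
```

Note that the probabilities of all rules sharing a left-hand side sum to 1 (for example, 0.6 + 0.4 for NP).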
Probabilistic Context-free Grammar
- PCFG achieves ~73% on PTB
- State-of-the-art: ~92%
  - Lexicalized PCFG (Collins 1997)
How to decide head?
- Usually use deterministic head rules (e.g., Collins head rules)
- Define heads in CFG
  - S → NP VP
  - VP → VBD NP PP
  - NP → DT JJ NN
From Noah Smith
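A rough sketch of how head rules turn a constituent tree into dependencies: each rule names one child as the head, the head word percolates upward, and every non-head child's head word becomes a dependent of it. The head positions below (VP for S, VBD for VP, NN for NP) are an assumption in the spirit of Collins-style rules, since the slide's head markings did not survive extraction; the PP rule and the example sentence are hypothetical.

```python
# Index of the head child for each rule (assumed; the slide's markings were lost).
HEAD_INDEX = {
    ("S", ("NP", "VP")): 1,          # head of S is the VP
    ("VP", ("VBD", "NP", "PP")): 0,  # head of VP is the verb
    ("NP", ("DT", "JJ", "NN")): 2,   # head of NP is the noun
    ("PP", ("IN", "NP")): 0,         # hypothetical rule added for the example
}


def lexical_head(tree, arcs):
    """Return the head word of `tree`, appending (head, dependent) pairs to `arcs`."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return children[0]  # preterminal node: (tag, word)
    rhs = tuple(child[0] for child in children)
    child_heads = [lexical_head(child, arcs) for child in children]
    head_pos = HEAD_INDEX[(label, rhs)]
    head_word = child_heads[head_pos]
    for pos, word in enumerate(child_heads):
        if pos != head_pos:
            arcs.append((head_word, word))  # non-head children depend on the head
    return head_word


# Hypothetical sentence with distinct words, so words can double as node names.
tree = (
    "S",
    ("NP", ("DT", "the"), ("JJ", "little"), ("NN", "boy")),
    ("VP",
     ("VBD", "kicked"),
     ("NP", ("DT", "a"), ("JJ", "red"), ("NN", "ball")),
     ("PP", ("IN", "in"),
      ("NP", ("DT", "that"), ("JJ", "big"), ("NN", "garden")))),
)
arcs = []
root = lexical_head(tree, arcs)
print(root)
print(arcs)
```

A real converter would index words by position rather than by string, so that repeated words stay distinct.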
Dependency parsing
- Can be more flexible (non-projective)
  - English is mostly projective
  - Some free-word-order languages (e.g., Czech) are non-projective
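A small sketch of what "projective" means operationally: a tree is projective if no two arcs cross when drawn above the sentence. The encoding (a list of head indices, with 0 for ROOT) and the head assignments for the second sentence are assumptions made for illustration.

```python
def is_projective(heads):
    """heads[d-1] is the head of word d (words are numbered from 1, 0 is ROOT).

    Returns True if no two arcs cross.
    """
    spans = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for left1, right1 in spans:
        for left2, right2 in spans:
            # Two arcs cross if exactly one endpoint of the second arc
            # lies strictly inside the first.
            if left1 < left2 < right1 < right2:
                return False
    return True


# "Joe likes Mary": Joe <- likes, ROOT -> likes, likes -> Mary
print(is_projective([2, 0, 2]))

# "A hearing is scheduled on the issue today" (illustrative analysis):
# "on" attaches to "hearing" and "today" attaches to "scheduled", so those arcs cross.
print(is_projective([2, 3, 0, 3, 2, 7, 5, 4]))
```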
How to build a dependency tree?
- There are several approaches
  - Graph algorithms
    - Consider all word pairs
    - Create a maximum spanning tree for a sentence
  - Transition-based approaches
    - Similar to how we parse a program: shift-reduce parser
  - Many other approaches…
Sources of information for DP
- Lexical affinities
  - [ issues → the ]
  - [ issues → I ]
- Distances
  - Words usually depend on nearby words
- Valency of heads
  - Number of dependents for a head
Graph-Based Approaches [McDonald et al. 2005]
- Consider all word pairs and assign scores
- Score of a tree = sum of the scores of its edges
- Can be solved as an MST problem
  - Chu-Liu-Edmonds
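To make the edge-factored objective concrete, here is a sketch that scores a tree as the sum of its edge scores and finds the best tree by exhaustive search. This is a brute-force stand-in, not Chu-Liu-Edmonds: it only works for very short sentences. The score matrix is made up for illustration, and the sketch does not restrict ROOT to a single child.

```python
from itertools import product


def is_tree(heads):
    """heads[d-1] is the head of word d (0 is ROOT).

    True if every word reaches ROOT by following heads, i.e. there is no cycle.
    """
    for d in range(1, len(heads) + 1):
        seen, node = set(), d
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def tree_score(heads, score):
    """Score of a tree = sum of score[head][dependent] over its edges."""
    return sum(score[h][d] for d, h in enumerate(heads, start=1))


def best_tree(score):
    """Exhaustive search over all head assignments (exponential in sentence length)."""
    n = len(score) - 1
    candidates = (h for h in product(range(n + 1), repeat=n) if is_tree(h))
    return max(candidates, key=lambda h: tree_score(h, score))


# Hypothetical scores for "Joe likes Mary"; score[h][d], index 0 is ROOT.
score = [
    [0, 2, 9, 1],  # ROOT  -> Joe, likes, Mary
    [0, 0, 3, 1],  # Joe   -> ...
    [0, 8, 0, 7],  # likes -> ...
    [0, 1, 4, 0],  # Mary  -> ...
]
best = best_tree(score)
print(best, tree_score(best, score))
```

Chu-Liu-Edmonds reaches the same optimum in polynomial time by greedily picking the best incoming edge for each word and contracting any cycles that result.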
Transition-based parser
- MaltParser (Nivre et al. 2008)
- Similar to a shift-reduce parser
  - But "reduce" actions can create dependencies
- The parser has:
  - A stack 𝜎: starts with a "Root" symbol
  - A buffer 𝛽: starts with the input sentence
  - A set of dependency arcs A: starts off empty
- Use a set of actions to parse sentences
  - Many possible action sets
Arc-Eager Dependency Parser
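- Shift:
    stack [ROOT], buffer [Joe likes Mary] → stack [ROOT Joe], buffer [likes Mary]
- Left-Arc:
    stack [ROOT Joe], buffer [likes Mary] → stack [ROOT], buffer [likes Mary], with new arc likes → Joe
    Precondition: w_i ≠ Root and (w_k, w_i) ∉ A, where w_i is the word on top of the stack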
Arc-Eager Dependency Parser
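- Right-Arc:
    stack [ROOT likes], buffer [Mary] → stack [ROOT likes Mary], buffer [], with new arc likes → Mary
    (Joe is already attached to likes)
- Reduce:
    stack [ROOT likes Mary], buffer [] → stack [ROOT likes], buffer []
    Precondition: (w_k, w_i) ∈ A, where w_i is the word on top of the stack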
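The four arc-eager actions can be sketched as functions on a configuration (stack, buffer, arc set). This is a minimal illustration, not MaltParser's implementation: the names `Config`, `shift`, `left_arc`, `right_arc` and `reduce_top` are made up here, and words are identified by their strings, which only works when no word repeats.

```python
from dataclasses import dataclass, field

ROOT = "ROOT"


@dataclass
class Config:
    stack: list
    buffer: list
    arcs: set = field(default_factory=set)  # (head, dependent) pairs

    def has_head(self, word):
        return any(dep == word for _, dep in self.arcs)


def initial(words):
    """Stack holds ROOT, buffer holds the sentence, arc set is empty."""
    return Config(stack=[ROOT], buffer=list(words))


def shift(c):
    """Move the first buffer word onto the stack."""
    c.stack.append(c.buffer.pop(0))


def left_arc(c):
    """Attach the stack top to the first buffer word, then pop the stack."""
    top = c.stack[-1]
    if top == ROOT or c.has_head(top):
        raise ValueError("Left-Arc precondition violated")
    c.arcs.add((c.buffer[0], top))
    c.stack.pop()


def right_arc(c):
    """Attach the first buffer word to the stack top, then push it."""
    nxt = c.buffer.pop(0)
    c.arcs.add((c.stack[-1], nxt))
    c.stack.append(nxt)


def reduce_top(c):
    """Pop the stack top; allowed only if it already has a head."""
    if not c.has_head(c.stack[-1]):
        raise ValueError("Reduce precondition violated")
    c.stack.pop()


c = initial(["Joe", "likes", "Mary"])
for action in (shift, left_arc, right_arc, right_arc, reduce_top, reduce_top):
    action(c)
print(c.stack, c.buffer, sorted(c.arcs))
```

In practice the action at each step is chosen by a trained classifier over features of the current configuration; here the sequence is fixed by hand.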
Arc-Eager Dependency Parser
- Start:
- Conduct a sequence of actions
- Terminate with 𝜎, 𝛽 = ∅
It’s your turn
- Happy children like to play with their friend .
- Shift → Left-arc → Shift → Left-arc → Right-arc → Shift → Left-arc → Right-arc → Right-arc → Shift → Left-arc → Right-arc → Reduce*3 → Right-arc → Reduce*3
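The action sequence above can be checked by replaying it. The sketch below is a compact, index-based replay (word 0 is ROOT) that returns the head of each word. One assumption is labeled in the code: the very last Reduce pops ROOT itself, which is permitted here so that the run ends with both stack and buffer empty, as the termination condition on the previous slide states.

```python
SENTENCE = "Happy children like to play with their friend .".split()
ACTIONS = (
    "Shift Left-arc Shift Left-arc Right-arc Shift Left-arc Right-arc "
    "Right-arc Shift Left-arc Right-arc Reduce*3 Right-arc Reduce*3"
)


def expand(actions):
    """Turn a token such as 'Reduce*3' into three 'Reduce' steps."""
    steps = []
    for token in actions.split():
        name, _, count = token.partition("*")
        steps.extend([name] * int(count or 1))
    return steps


def replay(words, actions):
    """Replay arc-eager actions; returns (stack, buffer, head), head[d] = head of word d."""
    stack, buffer, head = [0], list(range(1, len(words) + 1)), {}
    for step in expand(actions):
        if step == "Shift":
            stack.append(buffer.pop(0))
        elif step == "Left-arc":
            top = stack[-1]
            if top == 0 or top in head:
                raise ValueError("Left-arc precondition violated")
            head[top] = buffer[0]
            stack.pop()
        elif step == "Right-arc":
            nxt = buffer.pop(0)
            head[nxt] = stack[-1]
            stack.append(nxt)
        elif step == "Reduce":
            top = stack[-1]
            # Assumption: ROOT (index 0) may be popped at the very end.
            if top != 0 and top not in head:
                raise ValueError("Reduce precondition violated")
            stack.pop()
        else:
            raise ValueError(f"unknown action: {step}")
    return stack, buffer, head


stack, buffer, head = replay(SENTENCE, ACTIONS)
names = ["ROOT"] + SENTENCE
for dep in sorted(head):
    print(f"{names[head[dep]]} -> {names[dep]}")
```

The replay attaches "like" to ROOT; "children", "play" and "." to "like"; "Happy" to "children"; "to" and "with" to "play"; "friend" to "with"; and "their" to "friend".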