Source: speech.ee.ntu.edu.tw/~tlkagk/courses/DLHLP20/ParsingC (v2).pdf
Constituency Parsing
李宏毅 Hung-yi Lee
One Sequence | Multiple Sequences
• One Class: Sentiment Classification, Stance Detection, Veracity Prediction, Intent Classification, Dialogue Policy, NLI, Search Engine, Relation Extraction
• Class for each Token: POS tagging, Word segmentation, Extractive Summarization, Slot Filling, NER
• Copy from Input: Extractive QA
• General Sequence: Abstractive Summarization, Translation, Grammar Correction, NLG, General QA, Chatbot, State Tracker, Task Oriented Dialogue
• Other? Parsing, Coreference Resolution
Constituency Parsing
• Some text spans are constituents (“units”)
• Each constituent has a label.
deep learning is very powerful
constituent constituent
not constituent
NP ADJP
constituent
Constituency Parsing - Labels
+ All POS tags
Constituency Parsing
deep learning is very powerful
• Each word is a constituent (their labels are POS tags)
• Some consecutive constituents can form a larger one.
VP
S
NP ADJV
Form a tree
(Only considering binary tree in this course for simplicity)
Each constituent is a node.
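The tree described above can be sketched as a small data structure. This is only an illustration, assuming a binary tree as the slide does. The class name `Node` and the POS tags (JJ, NN, VBZ, RB) are assumptions, since the slides do not list the tags for this sentence. The constituent labels are copied from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One constituent: a leaf holds a word, an internal node holds two children."""
    label: str                       # constituent label, or POS tag for a leaf
    word: Optional[str] = None       # set only for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def words(self):
        """Return the words covered by this constituent, left to right."""
        if self.word is not None:
            return [self.word]
        return self.left.words() + self.right.words()

    def bracketed(self):
        """Return the usual bracketed notation for the subtree."""
        if self.word is not None:
            return f"({self.label} {self.word})"
        return f"({self.label} {self.left.bracketed()} {self.right.bracketed()})"


# POS tags below are assumed for illustration.
tree = Node(
    "S",
    left=Node("NP", left=Node("JJ", "deep"), right=Node("NN", "learning")),
    right=Node(
        "VP",
        left=Node("VBZ", "is"),
        right=Node("ADJV", left=Node("RB", "very"), right=Node("JJ", "powerful")),
    ),
)

print(tree.bracketed())
```

Every node, including each single word, is a constituent, and the words under any node always form a consecutive span of the sentence.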
Chart-based Approach
Source of image: https://web.stanford.edu/~jurafsky/slp3/13.pdf
CKY chart parsing
Chart-based
…… w2 w3 w4 w5 …… (span)
Constituent?
binary classification
Which label?
multi-class classification
Classifier
Chart-based
deep learning is very powerful
Constituent? Which label?
Classifier
YES
Constituent? Which label?
Classifier
NP ADJP YES
Chart-based
deep learning is very powerful
Constituent? Which label?
Classifier
NO NO
Constituent? Which label?
Classifier
Don’t Care Don’t Care
Constituent?
w1 w2 w3 w4 w5 w6 w7 w8 w9 w10
Span Feature Extraction
Pre-trained Model: ELMo, BERT …
Which Label?
Yes/No Label
Chart-based – Classifier
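The classifier on this slide can be sketched as two heads on top of a span feature. This is a toy illustration, not the model from the slides. The token vectors stand in for the output of a pre-trained model, the span feature (first and last token vectors concatenated) is an assumed choice, the weights are random, and all function names are hypothetical.

```python
import math
import random

LABELS = ["NP", "VP", "ADJP", "S"]  # illustrative label set


def span_feature(token_vecs, i, j):
    """Feature for tokens i..j (inclusive): first and last vectors concatenated."""
    return token_vecs[i] + token_vecs[j]


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def classify_span(token_vecs, i, j, w_const, w_label):
    """Return (P(constituent), most likely label, label distribution)."""
    feat = span_feature(token_vecs, i, j)
    p_const = sigmoid(linear(w_const, feat)[0])        # binary head
    label_probs = softmax(linear(w_label, feat))       # multi-class head
    best = LABELS[max(range(len(LABELS)), key=label_probs.__getitem__)]
    return p_const, best, label_probs


rng = random.Random(0)
dim = 4
# Stand-in for contextual embeddings of "deep learning is very powerful".
token_vecs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
w_const = [[rng.uniform(-1, 1) for _ in range(2 * dim)]]
w_label = [[rng.uniform(-1, 1) for _ in range(2 * dim)] for _ in LABELS]

p_const, best_label, label_probs = classify_span(token_vecs, 3, 4, w_const, w_label)
print(p_const, best_label)
```

With untrained weights the outputs are meaningless. The point is only the shape of the computation: one span in, a yes/no probability and a label distribution out.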
Chart-based
• Given a sequence with N tokens, run the classifier N(N-1)/2 times, once for each span of two or more tokens ……
deep learning is very powerful
Contradiction!
Constituent?
Classifier
YES
Constituent?
Classifier
YES
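A short sketch of the span count and of what a contradiction looks like. It assumes spans are written as inclusive index pairs (i, j) and that single words are not counted, which gives N(N-1)/2 spans. The function names are hypothetical.

```python
def all_spans(n):
    """All spans (i, j), inclusive, that cover at least two of the n tokens."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def spans_cross(a, b):
    """True if two spans overlap without one containing the other.

    Two crossing spans cannot both be constituents of the same tree.
    """
    (i, j), (k, l) = a, b
    return (i < k <= j < l) or (k < i <= l < j)


words = ["deep", "learning", "is", "very", "powerful"]
spans = all_spans(len(words))
print(len(spans))                      # N(N-1)/2 classifier calls

# "deep learning" and "learning is" overlap on one word: a contradiction
# if the classifier says YES to both.
print(spans_cross((0, 1), (1, 2)))
# "is very powerful" contains "very powerful": nesting is fine.
print(spans_cross((2, 4), (3, 4)))
```

Because each span is classified independently, nothing stops the classifier from answering YES to two crossing spans, which is why a separate inference step is needed.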
I am good
I am good
Inference: enumerate all possible trees and use the classifier to score each one. This is where the CKY algorithm is needed.
I am good
I am good
Classifier
0.1
Classifier
0.9
Classifier
0.8
Classifier
0.9
Training? [Stern, et al., ACL’17]
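A minimal sketch of CKY-style inference, assuming the score of a tree is the sum of the classifier scores of its multi-word spans. That scoring rule, the function name `cky_best_tree`, and the span scores below are assumptions for illustration. The scores are not a reading of the numbers on the slide.

```python
def cky_best_tree(words, span_score):
    """Find the highest-scoring binary tree by dynamic programming.

    span_score maps an inclusive span (i, j) to a classifier score.
    Returns (total score, tree), where a tree is a word or a pair of subtrees.
    """
    n = len(words)
    best = {}
    for i in range(n):
        best[(i, i)] = (0.0, words[i])         # single words are always kept
    for length in range(2, n + 1):             # shorter spans first
        for i in range(0, n - length + 1):
            j = i + length - 1
            candidates = []
            for k in range(i, j):              # split into (i..k) and (k+1..j)
                left_score, left_tree = best[(i, k)]
                right_score, right_tree = best[(k + 1, j)]
                candidates.append((left_score + right_score, (left_tree, right_tree)))
            split_score, tree = max(candidates, key=lambda c: c[0])
            best[(i, j)] = (span_score.get((i, j), 0.0) + split_score, tree)
    return best[(0, n - 1)]


words = ["I", "am", "good"]
# Illustrative scores: "I am" is unlikely, "am good" is likely.
scores = {(0, 1): 0.1, (1, 2): 0.9, (0, 2): 0.8}
total, tree = cky_best_tree(words, scores)
print(total, tree)
```

The dynamic program reuses the best subtree for every span, so it avoids listing every tree explicitly while still returning the best one under this scoring rule.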
Transition-based
Source of image: https://arxiv.org/pdf/1602.07776.pdf
Transition-based
Stack (empty at the beginning)
Buffer: deep learning is very powerful
Actions [Dyer, et al., NAACL’16]
• CREATE(X): Create a constituent X (X = NP, VP …)
• SHIFT: Move a token from the buffer to the stack
• REDUCE: Close a constituent
Transition-based
deep learning is very powerful
CREATE(S)
(empty at the beginning)
Transition-based
deep learning is very powerful
CREATE(S)
(S
CREATE(NP)
(NP
SHIFT
deep
SHIFT
learning
REDUCE
)
CREATE(VP)
(VP
SHIFT
is
CREATE(ADJV)
(ADJV
SHIFT
very
SHIFT
powerful
REDUCE
)
REDUCE
)
REDUCE
)
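The action sequence above can be replayed with a small stack machine. This is a sketch of the three actions only, with no neural network involved. The function name `run_actions` and the tuple used to mark an open constituent are assumptions.

```python
def run_actions(words, actions):
    """Replay CREATE(X) / SHIFT / REDUCE and return the bracketed tree."""
    buffer = list(words)        # tokens not yet read
    stack = []                  # empty at the beginning

    def is_open(item):
        return isinstance(item, tuple) and item[0] == "OPEN"

    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))                 # move a token to the stack
        elif action == "REDUCE":
            children = []
            while stack and not is_open(stack[-1]):     # pop back to the open marker
                children.append(stack.pop())
            if not stack:
                raise ValueError("REDUCE without an open constituent")
            _, label = stack.pop()
            closed = "(" + label + " " + " ".join(reversed(children)) + " )"
            stack.append(closed)                        # closed constituent is one item
        elif action.startswith("CREATE(") and action.endswith(")"):
            stack.append(("OPEN", action[len("CREATE("):-1]))
        else:
            raise ValueError("unknown action: " + action)

    if buffer or len(stack) != 1 or is_open(stack[0]):
        raise ValueError("action sequence does not form a complete tree")
    return stack[0]


words = ["deep", "learning", "is", "very", "powerful"]
actions = [
    "CREATE(S)", "CREATE(NP)", "SHIFT", "SHIFT", "REDUCE",
    "CREATE(VP)", "SHIFT", "CREATE(ADJV)", "SHIFT", "SHIFT",
    "REDUCE", "REDUCE", "REDUCE",
]
result = run_actions(words, actions)
print(result)
```

The labels, including ADJV, are copied from the slide so that the output matches the bracketed string shown later in the deck.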
RNN Grammar
Stack Buffer
Previous actions
RNN RNN
RNN
Network
SHIFT REDUCE CREATE(X)
deep learning is very powerful
CREATE(S)
(S
CREATE(NP)
(NP
SHIFT
deep
SHIFT
learning
REDUCE
Network
• typical classification task
• RL is not needed
RNN Grammar – Training
Ground truth
RNN
RNN
RNN
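Why this is an ordinary classification problem can be sketched as follows. Each step of the ground-truth action sequence becomes one training example, and the loss is the cross-entropy of the gold action. This is a simplified illustration: the state here is only the list of previous actions, whereas an RNN grammar would also encode the stack and the buffer. The function names and the probabilities are hypothetical.

```python
import math


def teacher_forcing_examples(gold_actions):
    """Turn one gold action sequence into (previous actions, next action) pairs."""
    return [(gold_actions[:t], gold_actions[t]) for t in range(len(gold_actions))]


def cross_entropy(predicted_probs, gold_action):
    """Negative log-probability that the network assigns to the gold action."""
    return -math.log(predicted_probs[gold_action])


gold = ["CREATE(S)", "CREATE(NP)", "SHIFT", "SHIFT", "REDUCE"]
examples = teacher_forcing_examples(gold)

# Made-up network output for the third step; a real model would compute this.
predicted = {"SHIFT": 0.5, "REDUCE": 0.25, "CREATE(NP)": 0.125, "CREATE(VP)": 0.125}
history, target = examples[2]
loss = cross_entropy(predicted, target)
print(history, target, loss)
```

Since the correct action at every step is known from the ground-truth tree, the network can be trained step by step with this loss, with no need to wait for a reward at the end of the sequence.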
Source of image: https://papers.nips.cc/paper/5635-grammar-as-a-foreign-language.pdf
[Vinyals, et al., NIPS’15]
deep learning is very powerful
VP
S
NP ADJV
Tree to Sequence
(S (NP deep learning ) (VP is (ADJV very powerful ) ) )
(tokens numbered 1 to 13 in output order)
Of course, you can try other tree traversal approaches
[Liu, et al., TACL’17]
Seq2seq!
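The tree-to-sequence step can be sketched as a pre-order traversal that emits an opening token for each constituent, then its children, then a closing bracket. The nested-tuple tree format and the function name `linearize` are assumptions for illustration.

```python
def linearize(tree):
    """Flatten a tree into the token sequence a seq2seq model would output.

    A tree is either a word (str) or a tuple (label, child, child, ...).
    """
    if isinstance(tree, str):
        return [tree]
    label, *children = tree
    tokens = ["(" + label]              # open the constituent
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")")                  # close the constituent
    return tokens


tree = ("S", ("NP", "deep", "learning"), ("VP", "is", ("ADJV", "very", "powerful")))
tokens = linearize(tree)
print(len(tokens))
print(" ".join(tokens))
```

For this sentence the traversal yields 13 output tokens, matching the numbering on the slide.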
deep learning is very powerful
(S (NP deep learning ) (VP is (ADJV very powerful ) ) )
CREATE(S) CREATE(NP) SHIFT SHIFT REDUCE CREATE(VP) SHIFT CREATE(ADJV) SHIFT SHIFT REDUCE REDUCE REDUCE
[Vinyals, et al., NIPS’15]
[Dyer, et al., NAACL’16]
Seq2seq vs. RNN Grammar
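The two output formats line up token by token, which a short sketch makes concrete: an opening token maps to CREATE, a word maps to SHIFT, and a closing bracket maps to REDUCE. The function name `brackets_to_actions` is hypothetical.

```python
def brackets_to_actions(tokens):
    """Map a bracketed token sequence to the matching action sequence."""
    actions = []
    for tok in tokens:
        if tok.startswith("(") and len(tok) > 1:
            actions.append("CREATE(" + tok[1:] + ")")   # "(NP" -> CREATE(NP)
        elif tok == ")":
            actions.append("REDUCE")                    # close a constituent
        else:
            actions.append("SHIFT")                     # an input word
    return actions


bracketed = "(S (NP deep learning ) (VP is (ADJV very powerful ) ) )"
actions = brackets_to_actions(bracketed.split())
print(" ".join(actions))
```

The sequences have the same length and structure. The difference is that the action sequence never has to generate the words themselves, since SHIFT takes them from the input.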
Unsupervised Parsing?
deep learning is very powerful
Can we find parse trees without labeled data?
YES!
Reference: https://youtu.be/YIuBHB9Ejok
Reference
• [Vinyals, et al., NIPS’15] Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton, Grammar as a foreign language, NIPS, 2015
• [Dyer, et al., NAACL’16] Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, Noah A. Smith, Recurrent Neural Network Grammars, NAACL, 2016
• [Stern, et al., ACL’17] Mitchell Stern, Jacob Andreas, Dan Klein, A Minimal Span-Based Neural Constituency Parser, ACL, 2017
• [Liu, et al., TACL’17] Jiangming Liu, Yue Zhang, In-Order Transition-based Constituent Parsing, TACL, 2017