Yang Liu
State Key Laboratory of Intelligent Technology and Systems
Tsinghua National Laboratory for Information Science and Technology
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
ACL 2013
Introduction
Current statistical machine translation approaches fall into two broad categories: phrase-based and syntax-based.
This work proposes a shift-reduce parsing algorithm that combines the advantages of both.
The translation units are string-to-dependency phrase pairs.
A maximum entropy model is used to resolve conflicts.
Introduction
datasets: the NIST Chinese-English translation datasets
evaluation: BLEU & TER, compared against phrase-based and syntax-based baselines
Shift-Reduce Parsing for Phrase-based String-to-Dependency Translation
Example: zongtong jiang yu siyue lai lundun fangwen
The President will visit London in April
GIZA++
Context-free grammar parser
Shift-Reduce Parsing for Phrase-based String-to-Dependency Translation
Two broad categories:
- well-formed:
  - fixed
  - floating (left or right, according to the position of the head)
- ill-formed

rule  source phrase        target phrase       dependency  category
r1    fangwen              visit               {}          fixed
r2    yu siyue             in April            {1 2}       fixed
r3    zongtong jiang       The President will  {2 1}       floating left
r4    yu siyue lai lundun  London in April     {2 3}       floating right
r5    zongtong jiang       President will      {}          ill-formed
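Assuming the head of every word in the full target sentence is known, the four categories above can be distinguished with a small check. This is only an illustrative sketch: the head-index encoding and the function names are assumptions, not the paper's implementation.

```python
def dependents(heads, i):
    """All positions whose head is word i."""
    return [j for j, h in enumerate(heads) if h == i]

def categorize(heads, begin, end):
    """Classify the phrase spanning begin..end (inclusive, 0-based).

    heads[i] is the position of word i's head in the whole sentence,
    or -1 for the sentence root (an assumed encoding).
    """
    span = list(range(begin, end + 1))
    inside = lambda j: begin <= j <= end
    # roots: words whose head lies outside the span
    roots = [i for i in span if not inside(heads[i])]
    # a word is "complete" when all of its dependents lie inside the span
    complete = lambda i: all(inside(j) for j in dependents(heads, i))
    if len(roots) == 1 and all(complete(i) for i in span if i != roots[0]):
        return "fixed"
    outer_heads = {heads[i] for i in roots}
    if len(outer_heads) == 1 and all(complete(i) for i in span):
        head = outer_heads.pop()
        # the piece floats on the left of a head to its right, and vice versa
        return "floating left" if head > end else "floating right"
    return "ill-formed"

# "The President will visit London in April"
#  positions: 0=The 1=President 2=will 3=visit 4=London 5=in 6=April
HEADS = [1, 3, 3, -1, 3, 3, 5]
```

With these heads, the five example phrases above fall into exactly the categories listed in the table: r5 ("President will") comes out ill-formed because "The", a dependent of "President", lies outside the phrase.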
shift-reduce algorithm - example
A state is a tuple <S, C>, where S is a stack of (partial) dependency structures and C is the set of covered source words.
Start from an empty state.
Terminate when all source words have been translated and the stack holds a single complete dependency tree.
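The three actions over a state <S, C> can be sketched as follows. This is a toy illustration rather than the paper's implementation: merged structures are represented as nested tuples with the head first, and each phrase pair carries the set of source positions it covers.

```python
def shift(state, phrase):
    """Push a phrase pair's target structure; mark its source span as covered."""
    stack, covered = state
    src_positions, target = phrase
    return stack + [target], covered | set(src_positions)

def reduce_left(state):
    """Merge the two topmost structures; the left one becomes the head."""
    stack, covered = state
    *rest, left, right = stack
    return rest + [(left, right)], covered

def reduce_right(state):
    """Merge the two topmost structures; the right one becomes the head."""
    stack, covered = state
    *rest, left, right = stack
    return rest + [(right, left)], covered

def finished(state, n_source_words):
    """All source words translated and a single tree left on the stack."""
    stack, covered = state
    return len(stack) == 1 and covered == set(range(n_source_words))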
A Maximum Entropy Based Shift-Reduce Parsing Model
h: fixed    l: left floating    r: right floating
A Maximum Entropy Based Shift-Reduce Parsing Model
maximum entropy model:

p(a | c, st, st-1) ∝ exp(θ · h(a, c, st, st-1))

a ∈ {S, Rl, Rr}: the candidate action
c: a boolean indicating whether all source words are covered
h(a, c, st, st-1): vector of binary features
θ: vector of feature weights
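Concretely, the model is a softmax over the three actions. A minimal sketch follows; the feature function here is a toy stand-in, not the paper's feature templates.

```python
import math

ACTIONS = ("S", "Rl", "Rr")  # shift, reduce left, reduce right

def h(a, c, s_top, s_prev):
    """Toy binary features keyed by name (a stand-in for real templates)."""
    return {f"action={a}": 1.0, f"action={a}&covered={c}": 1.0}

def action_prob(theta, a, c, s_top, s_prev):
    """p(a | c, s_t, s_t-1) = exp(theta . h(a, ...)) / sum over all actions."""
    def score(act):
        feats = h(act, c, s_top, s_prev)
        return sum(theta.get(name, 0.0) * val for name, val in feats.items())
    z = sum(math.exp(score(act)) for act in ACTIONS)
    return math.exp(score(a)) / z
```

A positive weight on a shift feature raises the probability of S relative to the two reduce actions, and the three probabilities always sum to one.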
A Maximum Entropy Based Shift-Reduce Parsing Model
A Maximum Entropy Based Shift-Reduce Parsing Model
To train the model, we would need a gold-standard action sequence for each training example, which is not readily available.
To alleviate this problem: a derivation graph that compactly encodes the valid action sequences.
Decoding
linear model with the following features:
standard features:
- relative frequencies in two directions
- lexical weights in two directions
- phrase penalty
- distance-based reordering model
- lexicalized reordering model
- n-gram language model
- word penalty
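All of these features enter a single weighted linear combination, as in a standard log-linear translation model. A minimal sketch (feature names are illustrative):

```python
def model_score(feature_values, weights):
    """Weighted sum of feature values; missing weights default to zero."""
    return sum(weights.get(name, 0.0) * value
               for name, value in feature_values.items())
```

During tuning, the weights are adjusted so that higher-scoring hypotheses correlate with better translations.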
Decoding (continued)
dependency features:
- ill-formed structure penalty
- dependency language model
- maximum entropy parsing model
Decoding
During decoding, the context information inside the stack keeps changing (dependency language model and maximum entropy model probabilities).
Hypergraph reranking (Huang and Chiang, 2007; Huang, 2008) is used, with decoding divided into two passes.
Decoding
To improve rule coverage, the ill-formed structures of Shen et al. (2008) are also used:
If an ill-formed structure has a single root, it is treated as a (pseudo) fixed structure.
Other ill-formed structures are split into a (pseudo) left floating structure and a (pseudo) right floating structure.
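The conversion can be sketched as below. How the roots are partitioned between the two floating pieces is an assumption made here for illustration (by the side on which each root's head falls); the slides do not spell it out.

```python
def to_pseudo(roots, heads, begin, end):
    """Turn an ill-formed structure over span begin..end into pseudo
    well-formed ones.  roots are the positions whose head lies outside
    the span; heads[i] is word i's head position in the sentence
    (assumed encoding)."""
    if len(roots) == 1:
        return {"pseudo fixed": list(roots)}
    return {
        # head to the right of the span -> the piece floats on the left
        "pseudo left floating": [r for r in roots if heads[r] > end],
        # head to the left of the span -> the piece floats on the right
        "pseudo right floating": [r for r in roots if heads[r] < begin],
    }
```

For example, for the ill-formed "President will" (both roots headed by "visit" to the right), every root lands in the left-floating piece.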
Experiments
evaluated on Chinese-English translation
training data: 2.9M sentence pairs, containing 76.0M Chinese words and 82.2M English words
development set: the 2002 NIST MT Chinese-English dataset
test sets: the 2003-2005 NIST datasets
Experiments
The Stanford parser was used to obtain dependency trees for the English sentences.
A 4-gram language model was trained on the Xinhua portion of the GIGAWORD corpus, which contains 238M English words.
A 3-gram dependency language model was trained on the English dependency trees.
Experiments
compared with:
- the Moses phrase-based decoder (Koehn et al., 2007)
- a re-implementation of the bottom-up string-to-dependency decoder (Shen et al., 2008)
beam limit: 100    phrase table limit: 20
Experiments
Moses shares the same feature set with our system except for the dependency features.
For the bottom-up string-to-dependency system, we included both well-formed and ill-formed structures in chart parsing.
Experiments
                                   Moses    dependency   This work
rule number                        103M     587M         124M
avg. decoding time (per sentence)  3.67 s   13.89 s      4.56 s
Conclusion
This work proposes a shift-reduce parsing algorithm for phrase-based string-to-dependency translation, which combines the advantages of the phrase-based and string-to-dependency models. In experiments on Chinese-to-English translation, it outperforms both baselines (phrase-based and syntax-based).
Future work
Add more contextual information to the maximum entropy model to better resolve conflicts; additionally, adapt the dynamic programming algorithm of Huang and Sagae (2010) to improve the string-to-dependency decoder.