Backward Machine Transliteration by Learning Phonetic Similarity

Intelligent Database Systems Lab

National Yunlin University of Science and Technology

Advisor: Dr. Hsu

Presenter: Chien Shing Chen

Authors: Wei-Hao Lin and Hsin-Hsi Chen

Backward Machine Transliteration by Learning Phonetic Similarity

Presented at the Sixth Conference on Natural Language Learning, Taipei, Taiwan, 2002


N.Y.U.S.T.

Outline

Motivation

Objective

Introduction

Grapheme-to-Phoneme Transformation

Similarity Measurement

Learning Phonetic Similarity

Experimental Results

Conclusions

Personal Opinion


Motivation

A similarity-based framework models the task of backward transliteration.

A learning algorithm automatically acquires phonetic similarities from a corpus.

Backward transliteration: recovering the original-language word from its transliteration, e.g. "本拉登" => Bin Laden.


Objective

Backward machine transliteration by learning phonetic similarity

雨果 (Yu-guo) => Hugo


Introduction

IPA: International Phonetic Alphabet

Hugo => h j u g oU

Yu-guo => v k uo

Similarity Measurement


Introduction

CMU Pronouncing Dictionary, version 0.6: ftp://ftp.cs.cmu.edu/project/fgdata/dict


Similarity Measurement: Alignment

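Let $\Sigma$ be the alphabet of the two strings S1 and S2, and let $\Sigma' = \Sigma \cup \{\_\}$, where '_' stands for a space.

Spaces can be inserted into S1 and S2 to produce S1' and S2' of equal length.

S1' and S2' are then aligned position by position.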


Similarity Measurement: Score

<English, Chinese> pair: <Hugo, Yu3-guo3>

The phoneme pair: (h j u g oU, v k uo)

$\Sigma' = \{h, j, u, v, g, k, oU, uo, \_\}$
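As a minimal sketch of how an alignment is scored, assume the score is the sum of the scoring-matrix entries of the symbols paired at each position. The function name `alignment_score`, the particular alignment of the Hugo example, and every numeric value in `M` are hypothetical and chosen only for illustration.

```python
GAP = "_"  # the space symbol added to the alphabet


def alignment_score(a1, a2, M, default=0.0):
    """Sum the matrix entries M[(p, q)] over the aligned positions.

    a1 and a2 are equal-length lists of phonemes (GAP allowed).
    M is a dict keyed by (phoneme, phoneme); missing pairs score `default`.
    """
    if len(a1) != len(a2):
        raise ValueError("aligned strings must have equal length")
    return sum(M.get((p, q), default) for p, q in zip(a1, a2))


# Hypothetical toy scores, not values from the paper.
M = {
    ("h", GAP): -1.0,
    ("j", GAP): -1.0,
    ("u", "v"): 2.0,
    ("g", "k"): 3.0,
    ("oU", "uo"): 4.0,
}

# One possible (assumed) alignment of Hugo and Yu-guo.
s1_aligned = ["h", "j", "u", "g", "oU"]
s2_aligned = [GAP, GAP, "v", "k", "uo"]
print(alignment_score(s1_aligned, s2_aligned, M))
```

Keeping `M` as a dictionary keyed by phoneme pairs makes it easy to leave most of the matrix unspecified in a toy example.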


Similarity Measurement: Score

$\Sigma' = \{h, j, u, v, g, k, oU, uo, \_\}$


Similarity Measurement: Dynamic Programming

Dynamic programming finds the optimal alignment, i.e. the alignment with the highest total score under the similarity scoring matrix M.

Optimal alignment of:

S1 (h j u g oU)

S2 (v k uo)


Similarity Measurement: Dynamic Programming

T is an (n+1) by (m+1) table, where n is the length of S1 and m is the length of S2.
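The table T can be filled with the standard global-alignment recurrence. The sketch below assumes that recurrence (each cell takes the best of a match, a space in S2, or a space in S1) and a caller-supplied `score(p, q)` function standing in for the matrix M; the names `align`, `score`, and `toy_score` are hypothetical, and the slide's own recurrence may differ in detail.

```python
GAP = "_"


def align(s1, s2, score):
    """Return (best_score, aligned_s1, aligned_s2) for phoneme lists s1, s2."""
    n, m = len(s1), len(s2)
    # T[i][j] = best score for aligning s1[:i] with s2[:j]
    T = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        T[i][0] = T[i - 1][0] + score(s1[i - 1], GAP)
    for j in range(1, m + 1):
        T[0][j] = T[0][j - 1] + score(GAP, s2[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            T[i][j] = max(
                T[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]),  # pair symbols
                T[i - 1][j] + score(s1[i - 1], GAP),            # space in s2
                T[i][j - 1] + score(GAP, s2[j - 1]),            # space in s1
            )
    # Trace back from T[n][m] to recover one optimal alignment.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and T[i][j] == T[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]):
            a1.append(s1[i - 1])
            a2.append(s2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and T[i][j] == T[i - 1][j] + score(s1[i - 1], GAP):
            a1.append(s1[i - 1])
            a2.append(GAP)
            i -= 1
        else:
            a1.append(GAP)
            a2.append(s2[j - 1])
            j -= 1
    return T[n][m], a1[::-1], a2[::-1]


def toy_score(p, q):
    """Hypothetical scorer: +1 for identical symbols, -1 otherwise."""
    return 1.0 if p == q else -1.0


best, top, bottom = align(list("abc"), list("ac"), toy_score)
print(best, top, bottom)
```

With a learned matrix, `score` would simply look up the entry for the pair `(p, q)` instead of comparing the symbols.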


Learning Phonetic Similarity

Develop a learning algorithm that removes the manual effort of assigning scores in the matrix.

Capture the subtle differences in phonetic similarity.

First, how to prepare a training corpus; then, the learning algorithm.


Learning Phonetic Similarity

Positive pairs: each original word is matched with its own transliteration.

Negative pairs: an original word is paired with the transliteration of a different word.

Ei: the original English word

Ci: the transliterated Chinese word

Corpus with n pairs:

克林頓 => Clinton

本拉登 => Bin Laden

魯賓遜 => Robinson

n positive pairs

n(n-1) negative pairs
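The pairing scheme can be sketched in a few lines: every corpus entry gives one positive pair, and crossing each English word with every other entry's transliteration gives the negative pairs. The function name `build_pairs` is hypothetical.

```python
def build_pairs(corpus):
    """Split a list of (english, chinese) entries into training pairs.

    Positive pairs keep each word with its own transliteration; negative
    pairs combine a word with the transliteration of a different entry.
    """
    positives = list(corpus)
    negatives = [
        (e_i, c_j)
        for i, (e_i, _) in enumerate(corpus)
        for j, (_, c_j) in enumerate(corpus)
        if i != j
    ]
    return positives, negatives


corpus = [("Clinton", "克林頓"), ("Bin Laden", "本拉登"), ("Robinson", "魯賓遜")]
positives, negatives = build_pairs(corpus)
print(len(positives), len(negatives))
```

For the three-entry corpus this yields 3 positive and 3 x 2 = 6 negative pairs, matching the n and n(n-1) counts.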


Learning Algorithm

Treat each training sample as a linear equation, $y = \sum_{i=1}^{m} \sum_{j=1}^{m} w_{i,j} x_{i,j}$.

m is the size of the phoneme set; m = 9 in the Hugo example.

$w_{i,j}$ is the entry in row i and column j of the scoring matrix.

$x_{i,j}$ is a binary value indicating the presence of $w_{i,j}$ in the alignment.

y is the similarity score.
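To make the linear-equation view concrete, the sketch below flattens the m by m scoring matrix into a weight vector and turns an alignment into the matching binary vector, so the similarity score becomes a dot product. The function names, the row-major flattening, and the example alignment are assumptions made for illustration.

```python
def alignment_to_features(a1, a2, alphabet):
    """Binary vector x of length m*m: x[i*m + j] = 1 if the phoneme pair
    (alphabet[i], alphabet[j]) occurs at some aligned position."""
    index = {p: k for k, p in enumerate(alphabet)}
    m = len(alphabet)
    x = [0] * (m * m)
    for p, q in zip(a1, a2):
        x[index[p] * m + index[q]] = 1
    return x


def predict(w, x):
    """Similarity score y as the dot product of weights and features."""
    return sum(w_k * x_k for w_k, x_k in zip(w, x))


alphabet = ["h", "j", "u", "v", "g", "k", "oU", "uo", "_"]  # m = 9
# Hypothetical alignment of Hugo (h j u g oU) with Yu-guo (v k uo).
x = alignment_to_features(
    ["h", "j", "u", "g", "oU"],
    ["_", "_", "v", "k", "uo"],
    alphabet,
)
w = [1.0] * len(x)  # placeholder weights, not learned values
print(len(x), sum(x), predict(w, x))
```

Each training pair contributes one such vector x and one target y, which is what allows the whole corpus to be stacked into a single linear system.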



The linear equations for the whole corpus can be conveniently written in matrix form, $Xw = y$, where R is the number of pairs in the corpus, so X has R rows.

i stands for the ith sample pair in the corpus.

$w_{i,j}$ is an entry of the scoring matrix.

$x_{i,j}$ is a binary value.

y is the similarity score.



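The criterion to be minimized is the sum-of-squared error (SSE), $\|Xw - y\|^2$.

The classical solution is to take the pseudo-inverse of $X$, i.e. $X^{\dagger} = (X^{T}X)^{-1}X^{T}$, to obtain the $w$ that minimizes the SSE, i.e. $w = X^{\dagger}y$.

The Widrow-Hoff rule is adopted to solve for $w$ iteratively.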



$k$ stands for the $k$th row in the matrix $X$.

$i$ is the iteration number.

$\eta$ is the learning rate.

$\alpha$ is the momentum coefficient.

$\eta$ and $\alpha$ are set empirically.



w(i) is updated iteratively until the learned w starts to overfit.

The iterations are designed so that w converges to a vector satisfying the minimum-SSE condition $X^{T}(Xw - y) = 0$.

To speed up learning, w(i) is updated immediately after each new training sample instead of after accumulating the errors of all training samples.

The other speed-up technique is momentum, which damps oscillations.
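A minimal sketch of per-sample Widrow-Hoff (least-mean-squares) learning with momentum follows. It assumes the textbook form of the update, where each step adds the learning rate times the sample error times the sample vector, plus the momentum coefficient times the previous step. The function name, the default values of `eta` and `alpha`, and the fixed epoch count are hypothetical; the slides stop when w starts to overfit rather than after a fixed number of passes.

```python
def widrow_hoff(X, y, eta=0.05, alpha=0.5, epochs=200):
    """Learn w so that each row x_k of X gives a dot product close to y_k.

    eta is the learning rate and alpha the momentum coefficient; both
    defaults are illustrative values, not the settings used in the paper.
    """
    w = [0.0] * len(X[0])
    prev_step = [0.0] * len(w)
    for _ in range(epochs):
        for x_k, y_k in zip(X, y):
            # Error on this single sample; w is updated right away.
            err = y_k - sum(w_j * x_j for w_j, x_j in zip(w, x_k))
            step = [eta * err * x_j + alpha * p_j
                    for x_j, p_j in zip(x_k, prev_step)]
            w = [w_j + s_j for w_j, s_j in zip(w, step)]
            prev_step = step
    return w


# Tiny consistent system: the exact solution is w = [1, -1].
X = [[1, 0], [0, 1], [1, 1]]
y = [1.0, -1.0, 0.0]
w = widrow_hoff(X, y)
print([round(v, 3) for v in w])
```

In practice the stopping point would be chosen by monitoring held-out pairs, since a fixed epoch count cannot detect overfitting.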


Experiments

The corpus consists of 1574 pairs of <English, Chinese> names.

313 of them have no entry in the pronouncing dictionary.

97 phonemes are used to represent these names, of which 59 are used for Chinese names and 51 for English names.

Rank is the position of the correct original word in the list of candidate words sorted by similarity score.
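The rank measure can be sketched as follows, assuming that a higher similarity score means a better candidate and that rank 1 is the best position. The function name and the candidate scores are hypothetical.

```python
def rank_of_correct(scores, correct):
    """1-based position of `correct` when candidates are sorted by
    descending similarity score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(correct) + 1


# Hypothetical similarity scores of candidate English words for 雨果.
scores = {"Hugo": 7.0, "Hogan": 5.5, "Clinton": -2.0}
print(rank_of_correct(scores, "Hugo"))
```

Averaging this rank over all test names gives a single figure for comparing scoring matrices.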


Experiments


Conclusions

The learning algorithm acquires phonetic similarities from the corpus without any phonological analysis or human intervention.


Personal Opinion

Drawback: obtaining the scoring matrix still depends on a few empirical rules.

Are the experimental results tied to the particular testing samples?

Application: a different method for computing the similarity between words.

Future work: the Widrow-Hoff rule might be used to estimate these parameters, replacing blind trial-and-error settings.

Combine speech recognition with this method to produce a new, more objective method.
