University of Alberta
Letter-to-phoneme conversion
Sittichai Jiampojamarn
sj@cs.ualberta.ca
CMPUT 500 / HUCO 612
September 26, 2007
Outline
• Part I
– Introduction to letter-to-phoneme conversion
• Part II
– Applying many-to-many alignments and hidden Markov models to letter-to-phoneme conversion (NAACL 2007)
• Part III
– Ongoing work: discriminative approaches for letter-to-phoneme conversion
• Part IV
– Possible term projects for CMPUT 500 / HUCO 612
The task
• Converting words to their pronunciations
– study -> [ s t ʌ d I ]
– band -> [ b æ n d ]
– phoenix -> [ f i n I k s ]
– king -> [ k I ŋ ]
• Words are sequences of letters.
• Pronunciations are sequences of phonemes.
– Ignoring syllabification and stress.
Why is it important?
• Major component in speech synthesis systems
• Word similarity based on pronunciation
– Spelling correction (Toutanova and Moore, 2001)
• Linguistic interest of relationships between letters and phonemes.
• Not a trivial task, but tractable.
Trivial solutions?
• Dictionary: look up the answer in a database
– Great effort to construct such a large lexicon database.
– Can’t handle new words and misspellings.
• Rule-based approaches
– Work well on non-complex languages
– Fail on complex languages
• Each word ends up creating its own rule, so the system ends up memorizing word-phoneme pairs.
John Kominek and Alan W. Black, “Learning Pronunciation Dictionaries: Language Complexity and Word Selection Strategies”, In Proceedings of HLT-NAACL 2006, June 4-9, pp. 232-239
Learning-based approaches
• Training data
– Examples of words and their phonemes.
• Hidden structure
– band -> [ b æ n d ]
• b -> [b], a -> [æ], n -> [n], d -> [d]
– abode -> [ ə b o d ]
• a -> [ə], b -> [b], o -> [o], d -> [d], e -> [_]
Alignments
• To train L2P, we need alignments between letters and phonemes
a -> [ə]
b -> [b]
o -> [o]
d -> [d]
e -> [_]
a b o d e
ə b o d _
Overview of the standard process
Training data -> 1-1 aligner -> Aligned data -> Phoneme prediction -> pronunciation
Letter-to-phoneme alignments
• Previous work assumed one-to-one alignment for simplicity (Daelemans and van den Bosch, 1997; Black et al., 1998; Damper et al., 2005).
• Expectation-Maximization (EM) algorithms are used to optimize the alignment parameters.
• Matching all possible letters and phonemes iteratively until the parameters converge.
1-to-1 alignments
• Initially, the alignment parameters can start from a uniform distribution, or from counts over all possible letter-phoneme mappings.
Ex. abode [ə b o d]: the null phoneme can fall on any of the five letters, which gives five possible alignments.
a b o d e
ə b o d _
ə b o _ d
ə b _ o d
ə _ b o d
_ ə b o d
P(a, ə) = 4/5
P(b, b) = 3/5
…
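The counting initialization above can be sketched in a few lines. This is a minimal illustration, not the aligner used in the talk: the function names are hypothetical, and the "parameter" is taken to be the number of times a pair occurs divided by the number of possible alignments, which is the reading that reproduces the 4/5 and 3/5 on the slide.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

NULL = "_"

def one_to_one_alignments(letters, phonemes):
    """Yield every 1-1 alignment obtained by padding the phonemes with nulls."""
    n_nulls = len(letters) - len(phonemes)
    if n_nulls < 0:
        return  # more phonemes than letters: impossible without null letters
    for null_positions in combinations(range(len(letters)), n_nulls):
        remaining = iter(phonemes)
        yield [(letter, NULL if i in null_positions else next(remaining))
               for i, letter in enumerate(letters)]

def initial_parameters(letters, phonemes):
    """Pair occurrences divided by the number of possible alignments."""
    alignments = list(one_to_one_alignments(letters, phonemes))
    counts = Counter(pair for alignment in alignments for pair in alignment)
    return {pair: Fraction(c, len(alignments)) for pair, c in counts.items()}

params = initial_parameters(list("abode"), ["ə", "b", "o", "d"])
print(params[("a", "ə")], params[("b", "b")])  # 4/5 3/5
```

In a full aligner these counts would be accumulated over the whole training lexicon rather than a single word.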
1-to-1 alignments
• Find the best possible alignments based on the current alignment parameters.
a b o d e
ə b o d _
• Based on the alignments found, update the parameters.
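The update step can be sketched as a simple re-count over the alignments chosen in the previous step. The slides do not say whether the parameters are joint or conditional probabilities; the sketch below assumes joint relative frequencies, and `update_parameters` is a hypothetical name.

```python
from collections import Counter

def update_parameters(aligned_words):
    """Re-estimate P(letter, phoneme) as relative frequencies over aligned pairs."""
    counts = Counter(pair for alignment in aligned_words for pair in alignment)
    total = sum(counts.values())
    return {pair: count / total for pair, count in counts.items()}

# Best alignments found for two training words under the current parameters.
aligned = [
    [("a", "ə"), ("b", "b"), ("o", "o"), ("d", "d"), ("e", "_")],  # abode
    [("b", "b"), ("a", "æ"), ("n", "n"), ("d", "d")],              # band
]
params = update_parameters(aligned)
```

Alternating this update with the search for the best alignments, until the parameters stop changing, gives the iterative procedure described on the previous slides.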
Finding the best possible alignments
• Dynamic programming:
– In the style of the standard weighted minimum edit distance algorithm.
– Treat the alignment parameter P(l, p) as the score of mapping letter l to phoneme p.
– Find the alignment that gives the maximum score.
– Allow null phonemes but not null letters.
• Null letters are hard to incorporate at test time, when only the letters are given.
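A minimal sketch of this search, assuming the score of an alignment is the product of its P(l, p) values (summed in log space) and that unseen pairs get a small floor score. The function name, the `floor` parameter, and the toy scores are illustrative assumptions.

```python
import math

NULL = "_"

def best_alignment(letters, phonemes, score, floor=1e-6):
    """Max-score 1-1 alignment; a letter may take a null phoneme, never the reverse."""
    def log_score(letter, phoneme):
        return math.log(score.get((letter, phoneme), floor))

    n, m = len(letters), len(phonemes)
    # best[i][j]: best log-score for aligning letters[:i] with phonemes[:j]
    best = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(0, min(i, m) + 1):
            # Option 1: letter i-1 maps to the null phoneme.
            cand = best[i - 1][j] + log_score(letters[i - 1], NULL)
            if cand > best[i][j]:
                best[i][j], back[i][j] = cand, (i - 1, j, NULL)
            # Option 2: letter i-1 maps to phoneme j-1.
            if j > 0:
                cand = best[i - 1][j - 1] + log_score(letters[i - 1], phonemes[j - 1])
                if cand > best[i][j]:
                    best[i][j], back[i][j] = cand, (i - 1, j - 1, phonemes[j - 1])
    if best[n][m] == -math.inf:
        return None  # no alignment exists, e.g. more phonemes than letters
    alignment, i, j = [], n, m
    while i > 0:
        prev_i, prev_j, phoneme = back[i][j]
        alignment.append((letters[i - 1], phoneme))
        i, j = prev_i, prev_j
    return alignment[::-1]

toy_scores = {("a", "ə"): 0.8, ("b", "b"): 0.9, ("o", "o"): 0.7,
              ("d", "d"): 0.9, ("e", NULL): 0.6}
result = best_alignment(list("abode"), ["ə", "b", "o", "d"], toy_scores)
```

With these toy scores the search returns the alignment a -> ə, b -> b, o -> o, d -> d, e -> _.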
Visualization
(Figure: dynamic-programming grid with the letters # a b o d e as columns and the phonemes # ə b o d as rows.)
Problems with 1-to-1 alignments
• Double letters: two letters map to one phoneme (e.g. ng -> [ŋ], sh -> [ʃ], ph -> [f]).
k i n g
k i ŋ
• A 1-to-1 aligner has to pair one of the two letters with a null phoneme:
k i n g
k i ŋ _
Problems with 1-to-1 alignments
• Double phonemes: one letter maps to two phonemes (e.g. x -> [k s], u -> [j u]).
f u m e
f j u m
• The correct alignment needs u -> [j u] and e -> [_], which a 1-to-1 alignment cannot express:
f u m e
f [j u] m _
Previous solutions for double phonemes
• Preprocess using a fixed list of double phonemes.
– [k s] -> [X]
– [j u] -> [U]
f u m e
f j u m

f u m e
f U m

f u m e
f U m _
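The preprocessing step amounts to a table lookup over adjacent phonemes. A minimal sketch follows; the table holds only the two entries shown on the slide, and the function names are hypothetical.

```python
DOUBLE_PHONEMES = {("k", "s"): "X", ("j", "u"): "U"}

def merge_double_phonemes(phonemes, table=DOUBLE_PHONEMES):
    """Replace each listed phoneme pair with its single placeholder symbol."""
    merged, i = [], 0
    while i < len(phonemes):
        pair = tuple(phonemes[i:i + 2])
        if pair in table:
            merged.append(table[pair])
            i += 2
        else:
            merged.append(phonemes[i])
            i += 1
    return merged

def split_double_phonemes(phonemes, table=DOUBLE_PHONEMES):
    """Undo the merge after prediction, restoring the original phoneme pairs."""
    inverse = {single: pair for pair, single in table.items()}
    restored = []
    for phoneme in phonemes:
        restored.extend(inverse.get(phoneme, (phoneme,)))
    return restored

print(merge_double_phonemes(["f", "j", "u", "m"]))  # ['f', 'U', 'm']
```

Note that a fixed list merges the pair wherever it occurs, whether or not a single letter produced it.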
Applying many-to-many alignments and Hidden Markov Models to Letter-to-Phoneme conversion
Sittichai Jiampojamarn, Grzegorz Kondrak and Tarek Sherif
Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), Rochester, NY, April 2007, pp. 372-379.
Overview system
(Figure: system diagram. Alignment process: training data -> 1-1 aligner / M-M aligner -> aligned data. Prediction process: chunking prediction -> local prediction -> HMM, which together make up phoneme prediction and output the pronunciation.)
Many-to-many alignments
• EM-based method.
• Extended from the forward-backward training of a one-to-one stochastic transducer (Ristad and Yianilos, 1998).
• Allow one or two letters to map to null, one, or two phonemes.
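To make the search space concrete, the sketch below enumerates every alignment permitted by this constraint: each step consumes one or two letters and zero, one, or two phonemes. It only enumerates; the EM training that weights these alignments is not shown, and the function name and chunk-size parameters are assumptions.

```python
def many_to_many_alignments(letters, phonemes, max_letters=2, max_phonemes=2):
    """Yield alignments as lists of (letter_chunk, phoneme_chunk) tuples.

    An empty phoneme chunk () stands for the null phoneme.
    """
    if not letters:
        if not phonemes:
            yield []  # both sequences consumed: one complete alignment
        return
    for i in range(1, min(max_letters, len(letters)) + 1):
        for j in range(0, min(max_phonemes, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            for rest in many_to_many_alignments(
                    letters[i:], phonemes[j:], max_letters, max_phonemes):
                yield [head] + rest

alignments = list(many_to_many_alignments(list("king"), ["k", "ɪ", "ŋ"]))
target = [(("k",), ("k",)), (("i",), ("ɪ",)), (("n", "g"), ("ŋ",))]
print(target in alignments)  # True
```

Training would score each of these candidates with the current parameters instead of listing them explicitly, which is what the forward-backward computation does efficiently.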
Many-to-many alignments
(Figure: alignment grid with the letters p h o e n i x along one axis and the phonemes f i n ɪ k s along the other.)
p h o e n i x
f i n ɪ k s
Prediction problem
• Should the prediction model generate phonemes from one or two letters?
– gash -> [ g æ ʃ ]
– gasholder -> [ g æ s h o l d ə r ]
g a sh
g æ ʃ

g a s h o l d e r
g æ s h o l d ə r
Letter chunking
• A bigram letter-chunking predictor automatically discovers double letters.
Ex. longs
l ɒ ŋ z
l o ng s
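A sketch of how chunking decisions turn a word into letter chunks. The predictor here is a hypothetical stand-in (a lookup in a small set of double letters), not the trained bigram chunker from the paper; it is included to show the interface and why a context-free lookup is not enough.

```python
def chunk_letters(word, is_double):
    """Merge adjacent letters left to right whenever the predictor accepts the bigram."""
    chunks, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and is_double(word, i):
            chunks.append(word[i:i + 2])
            i += 2
        else:
            chunks.append(word[i])
            i += 1
    return chunks

# Hypothetical stand-in predictor: a fixed set of double letters, no context.
DOUBLES = {"ng", "sh", "ph"}

def lookup_predictor(word, i):
    return word[i:i + 2] in DOUBLES

print(chunk_letters("longs", lookup_predictor))      # ['l', 'o', 'ng', 's']
print(chunk_letters("gasholder", lookup_predictor))  # ['g', 'a', 'sh', 'o', 'l', 'd', 'e', 'r']
```

The second result is wrong for gasholder, where s and h are pronounced separately. That is the case a trained predictor with access to the surrounding letters is meant to handle.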
Phoneme prediction
• Once the training examples are aligned, we need a phoneme prediction model.
• “Classification task” or “sequence prediction”?
Sequence prediction:
P0 P1 P2 P3
L0 L1 L2 L3
Classification, one letter window per phoneme:
# L0 L1 -> P0
L0 L1 L2 -> P1
L1 L2 L3 -> P2
L2 L3 # -> P3
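The classification view turns each aligned word into one training instance per letter: a window of neighboring letters as features and the aligned phoneme as the class. A minimal sketch, with a window of one letter on each side as in the figure; the function names and the `size` parameter are assumptions.

```python
def letter_windows(letters, size=1, pad="#"):
    """Return one context window per letter, padded with '#' at the word edges."""
    padded = [pad] * size + list(letters) + [pad] * size
    return [tuple(padded[i:i + 2 * size + 1]) for i in range(len(letters))]

def training_instances(aligned_word, size=1):
    """Pair each letter window with the phoneme aligned to its center letter."""
    letters = [letter for letter, _ in aligned_word]
    phonemes = [phoneme for _, phoneme in aligned_word]
    return list(zip(letter_windows(letters, size), phonemes))

for window, phoneme in training_instances(
        [("b", "b"), ("a", "æ"), ("n", "n"), ("d", "d")]):
    print(window, "->", phoneme)
```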
Instance based learning
• Store the training examples.
• The predicted class is assigned by searching for the “most similar” training instance.
• Similarity functions:
– Hamming distance, Euclidean distance, etc.
(Figure: the letter A asks “Who do I look like most?”, and the candidate phonemes æ, ɑ, and ə each answer “Me!!”)
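A minimal nearest-neighbor sketch using plain Hamming distance over letter windows. It is an illustration of the idea only: the stored instances are toy examples, and TiMBL's IB1 offers feature weighting and other options that are not modeled here.

```python
def hamming(a, b):
    """Number of positions at which two equal-length windows differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbor(instances, query):
    """Return the class of the stored instance closest to the query (first wins ties)."""
    _, label = min(instances, key=lambda inst: hamming(inst[0], query))
    return label

# Toy memory: windows around the letter "a" and the phoneme it produced.
memory = [
    (("#", "a", "b"), "ə"),  # a in "abode"
    (("b", "a", "n"), "æ"),  # a in "band"
    (("c", "a", "r"), "ɑ"),  # a in "car" (illustrative)
]
print(nearest_neighbor(memory, ("h", "a", "n")))  # æ
```

The query window differs from the "band" instance in one position and from the other two in two positions, so the stored class æ is returned.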
Basic HMMs
• A basic sequence-based prediction method.
• In L2P,
– letters are observations
– phonemes are states
• Output phoneme sequences depend on both emission and transition probabilities.
Applying HMM
• Use instance-based learning to produce a list of candidate phonemes with confidence values conf(phone_i) for each letter_i. These play the role of emission probabilities.
• Use a language model over the phoneme sequences in the training data to obtain the transition probabilities P(phone_i | phone_i-1, …, phone_i-n).
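The combination can be sketched as a Viterbi search over the candidate lists, assuming a bigram phoneme model for simplicity. The two confidence values below come from the "buried" example in the talk; the transition values, the `floor` for unlisted transitions, and the function name are illustrative assumptions.

```python
import math

def viterbi(candidates, transition, start="#", floor=1e-6):
    """Best phoneme sequence given per-letter candidates and bigram transitions.

    candidates: one {phoneme: confidence} dict per letter (confidences > 0).
    transition: {(previous_phoneme, phoneme): probability}; missing pairs get `floor`.
    """
    # paths maps each phoneme in the current column to (log-score, best sequence so far)
    paths = {start: (0.0, [])}
    for column in candidates:
        new_paths = {}
        for phoneme, conf in column.items():
            new_paths[phoneme] = max(
                (score + math.log(transition.get((prev, phoneme), floor)) + math.log(conf),
                 seq + [phoneme])
                for prev, (score, seq) in paths.items()
            )
        paths = new_paths
    return max(paths.values())[1]

# Candidates for b-u-r-i-e-d; only the letter i is ambiguous.
candidates = [{"b": 1.0}, {"E": 1.0}, {"r": 1.0},
              {"aI": 0.714, "I": 0.286}, {"_": 1.0}, {"d": 1.0}]
transition = {("r", "aI"): 0.003, ("r", "I"): 0.700}  # assumed values
print(viterbi(candidates, transition))  # ['b', 'E', 'r', 'I', '_', 'd']
```

The local classifier alone prefers aI (confidence 0.714), but the transition model makes the sequence containing I score higher overall.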
Visualization
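(Figure: HMM lattice for “buried” with the letter / phoneme candidates b / b, u / E, r / r, i / aI or i / I, e / _, d / d. Probabilities shown in the lattice: 0.048, 0.067, 0.003, 0.700, 0.008, 0.014, 0.433.)
Conf( i / aI ) = 0.714
Conf( i / I ) = 0.286
buried -> [ b E r aI d ] = 2.38 × 10^-8
buried -> [ b E r I d ] = 2.23 × 10^-6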
Evaluation
• Data sets
– English: CMUDict (112K), Celex (65K).
– Dutch: Celex (116K).
– German: Celex (49K).
– French: Brulex (27K).
• The IB1 algorithm implemented in the TiMBL package is used as the classifier (Daelemans et al., 2004).
• Results are reported in word accuracy rate based on 10-fold cross validation.
(Figure: bar chart of word accuracy, y-axis from 50 to 95, on CMUDict, English Celex, Dutch Celex, German Celex, and French Brulex, comparing 1-1 alignments, 1-1 alignments + HMM, and M-M alignments.)
Messages
• Many-to-many alignments show significant improvements over traditional one-to-one alignments.
• An HMM-like approach helps when the local classifier has difficulty predicting phonemes.
Criticism
• Joint models
– Alignments, chunking, prediction, and HMM.
• Error propagation
– Errors propagate from one model to the next, and later models are unlikely to correct them.
• Can we combine the models and optimize them at once? Or at least allow the system to correct past errors?
On-going work
Discriminative approach for letter-to-phoneme conversion
Online discriminative learning
• Let x be an input word and y an output phoneme sequence.
• A feature vector represents features describing x and y.
• A weight vector assigns one weight to each of these features.
Online training algorithm
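1. Initialize the weight vector (typically to zero).
2. For k iterations
1. For all letter-phoneme sequence pairs (x, y)
1. predict the best-scoring phoneme sequence for x under the current weights
2. update the weights according to the prediction and the correct output y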
Perceptron update (Collins, 2002)
• A simple update rule for training.
• When the prediction is wrong, move the weights in the direction of the correct answer.
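A toy sketch of the online loop with the perceptron update. Everything specific here is an assumption made to keep the example small: the features are counts of (letter, phoneme) pairs under a 1-1 alignment, the phoneme inventory has two symbols, and the argmax is done by brute force over all sequences instead of a proper search.

```python
from collections import Counter, defaultdict
from itertools import product

PHONEMES = ("æ", "b")  # toy inventory (assumption)

def features(word, phonemes):
    """Feature vector for (x, y): counts of (letter, phoneme) pairs."""
    return Counter(zip(word, phonemes))

def score(weights, word, phonemes):
    return sum(weights[f] * v for f, v in features(word, phonemes).items())

def predict(weights, word):
    """Brute-force argmax over every phoneme sequence as long as the word."""
    return max(product(PHONEMES, repeat=len(word)),
               key=lambda y: score(weights, word, y))

def train_perceptron(data, iterations=5):
    weights = defaultdict(float)  # all weights start at zero
    for _ in range(iterations):
        for word, gold in data:
            guess = predict(weights, word)
            if guess != gold:
                # Move toward the correct answer, away from the wrong prediction.
                for f, v in features(word, gold).items():
                    weights[f] += v
                for f, v in features(word, guess).items():
                    weights[f] -= v
    return weights

data = [("ab", ("æ", "b")), ("ba", ("b", "æ"))]
weights = train_perceptron(data)
print(predict(weights, "ab"), predict(weights, "ba"))
```

A real system would replace the brute-force `predict` with a dynamic-programming search and use much richer features.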
Examples
• Separable case
Adapted from Dan Klein’s tutorial slides at NAACL 2007.
Examples
• Non-separable case
Adapted from Dan Klein’s tutorial slides at NAACL 2007.
Issues with Perceptron
• Overtraining: test / held-out accuracy usually rises, then falls.
• Regularization:
– If the data isn’t separable, weights often thrash around.
– Finds a “barely” separating solution.
Taken from Dan Klein’s tutorial slides at NAACL 2007.
Margin Infused Relaxed Algorithm (MIRA) (Crammer and Singer, 2003)
• Uses an n-best list to update the weights.
• Separates the correct output from each n-best candidate by a margin at least as large as the loss of that candidate,
• while keeping the weight changes as small as possible.
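For a single incorrect candidate, the smallest weight change that satisfies the margin constraint has a closed form. The sketch below shows that simplified one-constraint case only; the n-best version on the slide solves for several constraints at once, which is not shown. Function names and the dictionary-based feature vectors are assumptions.

```python
def dot(weights, feats):
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())

def mira_update(weights, gold_feats, pred_feats, loss):
    """Smallest change to weights so that gold outscores pred by at least `loss`."""
    keys = set(gold_feats) | set(pred_feats)
    delta = {f: gold_feats.get(f, 0) - pred_feats.get(f, 0) for f in keys}
    norm_sq = sum(v * v for v in delta.values())
    if norm_sq == 0:
        return weights  # identical feature vectors: nothing to separate
    # Step size: zero if the margin already holds, otherwise just enough to reach it.
    tau = max(0.0, (loss - dot(weights, delta)) / norm_sq)
    for f, v in delta.items():
        weights[f] = weights.get(f, 0.0) + tau * v
    return weights

gold = {("a", "æ"): 1, ("b", "b"): 1}
pred = {("a", "æ"): 1, ("b", "æ"): 1}
weights = mira_update({}, gold, pred, loss=1.0)
margin = dot(weights, gold) - dot(weights, pred)
```

After the update the margin equals the loss exactly, and a second call with the same arguments leaves the weights unchanged, which is the "as small as possible" behavior.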
Loss function in letter-to-phoneme
• Describe the loss of an incorrect prediction compared to the correct one.
• Word error (0/1), phoneme error, or combination.
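The three options can be sketched as follows. The slides do not define the phoneme error or how the combination is formed; this sketch assumes edit distance between phoneme sequences for the former and a plain sum for the latter.

```python
def word_loss(gold, predicted):
    """0/1 loss: any difference counts as one error."""
    return 0 if list(gold) == list(predicted) else 1

def phoneme_loss(gold, predicted):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(predicted) + 1))
    for i, g in enumerate(gold, start=1):
        cur = [i]
        for j, p in enumerate(predicted, start=1):
            cur.append(min(prev[j] + 1,               # delete g
                           cur[j - 1] + 1,            # insert p
                           prev[j - 1] + (g != p)))   # substitute or match
        prev = cur
    return prev[-1]

def combined_loss(gold, predicted):
    """Assumed combination: sum of the word and phoneme losses."""
    return word_loss(gold, predicted) + phoneme_loss(gold, predicted)

gold = ["b", "E", "r", "I", "d"]
guess = ["b", "E", "r", "aI", "d"]
print(word_loss(gold, guess), phoneme_loss(gold, guess), combined_loss(gold, guess))  # 1 1 2
```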
Results
• Incomplete!
– MIRA outperforms Perceptron.
– The 0/1 loss and the combination loss work better than the phoneme loss function alone.
– Overall, results show better performance than previous work.
Possible term projects
1. Explore more linguistic features.
2. Explore machine translation systems for letter-to-phoneme conversion.
3. Unsupervised approaches for letter-to-phoneme conversion.
4. Other cool ideas to improve on a partial system
– Data for evaluation are provided.
– Alignments are provided.
– L2P models are provided.
Linguistic features
• Look for linguistic features that help L2P.
– Most systems already incorporate letter n-gram features in some way.
• New features must be obtainable from the word itself, using only word information.
• Work already done
– Syllabification: Susan’s thesis
• Finds syllable breaks in letter sequences using an SVM approach.
Machine translation approach
• The L2P problem can be seen as a (simple) machine translation problem,
• where we would like to translate letters to phonemes.
– Consider: L2P -> MT
• Letters -> words
• Words -> sentences
• Phonemes -> target sentences
• Moses: a baseline SMT system, ACL 2007
– http://www.statmt.org/wmt07/baseline.html
– May need to also look at GIZA++, Pharaoh, Carmel, etc.
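Under this analogy, preparing the data mostly means writing each word as a "sentence" of space-separated letters and each pronunciation as a "sentence" of space-separated phonemes, one pair per line. The sketch below produces the two line lists; the function name is hypothetical, and writing the lists to files and running a toolkit on them is left out.

```python
def to_parallel_lines(pairs):
    """Turn (word, phonemes) pairs into source and target 'sentences'."""
    source = [" ".join(word) for word, _ in pairs]
    target = [" ".join(phonemes) for _, phonemes in pairs]
    return source, target

lexicon = [
    ("phoenix", ["f", "i", "n", "ɪ", "k", "s"]),
    ("king", ["k", "ɪ", "ŋ"]),
]
source, target = to_parallel_lines(lexicon)
print(source[0])  # p h o e n i x
print(target[0])  # f i n ɪ k s
```

Line i of the source list corresponds to line i of the target list, which is the usual layout of a plain-text parallel corpus.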
Unsupervised approaches
• Assume we don’t have examples of word-phoneme pairs to train a model.
• We can start from a list of possible letter-phoneme mappings.
• Or assume we have a small set of example pairs (~100 pairs).
• Don’t expect to outperform the supervised approach, but take advantage of the method being unsupervised.
References
• Collins, M. 2002. Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Volume 10. Association for Computational Linguistics, Morristown, NJ, 1-8.
• Crammer, K. and Singer, Y. 2003. Ultraconservative online algorithms for multiclass problems. J. Mach. Learn. Res. 3 (Mar. 2003), 951-991.
• Kristina Toutanova and Robert C. Moore. 2001. “Pronunciation modeling for improved spelling correction”. In ACL’02, pp. 144-151.
• John Kominek and Alan W Black. 2006. “Learning Pronunciation Dictionaries: Language Complexity and Word Selection Strategies”. In Proceedings of HLT-NAACL 2006, pp. 232-239.
• Walter M. P. Daelemans and Antal P. J. van den Bosch. 1997. “Language-independent data-oriented grapheme-to-phoneme conversion”. In Progress in Speech Synthesis, pages 77-89. Springer, New York.
• Alan W Black, Kevin Lenzo, and Vincent Pagel. 1998. “Issues in building general letter to sound rules”. In The Third ESCA Workshop in Speech Synthesis, pages 77-80.
• Robert I. Damper, Yannick Marchand, John DS. Marsters, and Alexander I. Bazin. 2005. “Aligning text and phonemes for speech technology applications using an EM-like algorithm”. International Journal of Speech Technology, 8(2):147-160.
• Eric Sven Ristad and Peter N. Yianilos. 1998. “Learning string-edit distance”. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(5):522-532.
• Walter Daelemans, Jakub Zavrel, Ko van der Sloot, and Antal van den Bosch. 2004. “TiMBL: Tilburg Memory Based Learner, version 5.1, reference guide”. ILK Technical Report Series 04-02.