Computational Phonology: Speech Recognition and Text-to-Speech Systems
Phonology
• Phonetic alphabets
• Phonological rules
• Computational Phonology
• Phonological Learning
• Optimality Theory
Phonetics
• Study of the pronunciation of words
• Words are strings of symbols which represent phones
• Can also include prosody
Phonetic Alphabets
• International Phonetic Alphabet (IPA)
– Evolving standard since 1888
– Goal is to be able to transcribe the sounds of all human languages
Phonetic Symbols
IPA
• International Phonetic Alphabet
• Evolving standard since 1888
• Goal is to be able to transcribe sounds of all human languages
ARPAbet
• Specifically for American English
• Can be used where non-ASCII fonts are inconvenient (such as in online pronunciation dictionaries)
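As a rough illustration of how an ASCII-only alphabet relates to IPA, the sketch below maps a handful of ARPAbet symbols to IPA. The table is deliberately partial, the helper name arpabet_to_ipa is hypothetical, and the transcription of "tunafish" is an illustrative broad transcription, not an entry copied from any dictionary.

```python
# Partial ARPAbet-to-IPA table: only the symbols needed for the example word.
# Stress digits (as used in some pronunciation dictionaries) are omitted.
ARPABET_TO_IPA = {
    "T": "t",
    "UW": "u",
    "N": "n",
    "AH": "ʌ",
    "F": "f",
    "IH": "ɪ",
    "SH": "ʃ",
}


def arpabet_to_ipa(phones):
    """Convert a list of ARPAbet symbols to a single IPA string."""
    return "".join(ARPABET_TO_IPA[p] for p in phones)


# Illustrative broad transcription (an assumption, not a dictionary entry).
tunafish = ["T", "UW", "N", "AH", "F", "IH", "SH"]
print(arpabet_to_ipa(tunafish))
```

A full converter would need the complete symbol inventory and a policy for stress marks; this table covers only the example.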
Phonological Rules
• Not all [t]s are created equal
• Phones are pronounced differently in different contexts (phoneme vs. allophone)
• e.g. [t] in tunafish is aspirated
• e.g. [t] in starfish (following initial s) is unaspirated
• Broad transcription vs. narrow transcription
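The difference between broad and narrow transcription can be sketched as a function that adds allophonic detail to a broad phone list. The rule here is heavily simplified (only word-initial /t/ is aspirated), and the phone lists and the function name narrow_transcription are assumptions made for illustration.

```python
def narrow_transcription(phones):
    """Return a narrower transcription: aspirate /t/ only when it is word-initial.

    Simplified on purpose: real English aspiration also depends on stress
    and syllable position, which this sketch ignores.
    """
    narrow = []
    for i, phone in enumerate(phones):
        if phone == "t" and i == 0:
            narrow.append("tʰ")  # aspirated allophone
        else:
            narrow.append(phone)  # everything else is left as in the broad form
    return narrow


# Illustrative broad transcriptions (assumptions, not dictionary entries).
tunafish = ["t", "u", "n", "ʌ", "f", "ɪ", "ʃ"]
starfish = ["s", "t", "ɑ", "ɹ", "f", "ɪ", "ʃ"]

print(narrow_transcription(tunafish))  # first phone becomes aspirated
print(narrow_transcription(starfish))  # [t] after initial s stays unaspirated
```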
Phonological Rules
• ladder
• lotus
{t, d} → [ɾ] / V __ V
(flapping: /t/ and /d/ are realized as a flap between vowels)
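The rule for "ladder" and "lotus" can be sketched as a rewrite over a phone list: a /t/ or /d/ with a vowel on each side is replaced by a flap. The vowel set, the phone lists, and the function name flap are assumptions made for this example, and stress (which matters for real English flapping) is ignored.

```python
# Small vowel inventory, sufficient for the two example words only.
VOWELS = {"æ", "ɚ", "oʊ", "ə"}


def flap(phones):
    """Replace /t/ or /d/ with the flap [ɾ] when a vowel stands on both sides."""
    out = list(phones)
    for i in range(1, len(phones) - 1):
        between_vowels = phones[i - 1] in VOWELS and phones[i + 1] in VOWELS
        if phones[i] in ("t", "d") and between_vowels:
            out[i] = "ɾ"
    return out


# Illustrative broad transcriptions (assumptions, not dictionary entries).
ladder = ["l", "æ", "d", "ɚ"]
lotus = ["l", "oʊ", "t", "ə", "s"]

print(flap(ladder))
print(flap(lotus))
```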
Two-Level Morphology
• Koskenniemi (1983)
• Most phonological rules are independent
• Feeding and bleeding relations are rare
• Explicitly code when rule is obligatory or optional
Rule type          Interpretation
a:b ⇐ c ___ d      a is always realized as b in the context c ___ d
a:b ⇒ c ___ d      a may be realized as b only in the context c ___ d
a:b ⇔ c ___ d      a must be realized as b in the context c ___ d and nowhere else
a:b /⇐ c ___ d     a is never realized as b in the context c ___ d
Optimality Theory (OT)
• Prince and Smolensky, 1993
• Is a Connectionist theory of language
• Views phonological derivation based on:
– Two functions (GEN and EVAL) and
– A set of ranked violable constraints (CON)
• The constraints are assumed to be cross-linguistic generalizations
Optimality Theory (OT)
• Given an underlying form:
– GEN function produces all imaginable surface forms
– EVAL function then applies each constraint in CON to these surface forms in order of constraint rank
Optimality Theory (OT)
• Constraints
– Faithfulness (checks how faithful the surface form is to the underlying form)
  • e.g. FaithV: says “Don’t delete or insert vowels”
  • e.g. FaithC: says “Don’t delete or insert consonants”
– Markedness (imposes requirements on the structural well-formedness of the output)
  • e.g. *Complex: says “no complex onsets or codas”
http://en.wikipedia.org/wiki/Optimality_Theory
Optimality Theory (OT)
• Uses constraints to filter out unneeded surface forms
• Some constraints are more important than others
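A minimal sketch of ranked evaluation, assuming a hand-listed candidate set in place of a real GEN and very crude stand-ins for FaithV, FaithC, and *Complex (counting segments and adjacent consonants in place of real correspondence and syllable structure). The underlying form "plan" and its candidates are invented for illustration. Ranking is modeled by comparing violation tuples lexicographically, so a higher-ranked constraint always outweighs any number of lower-ranked violations.

```python
VOWELS = set("aeiou")


def faith_v(underlying, candidate):
    """Crude FaithV: difference in vowel count stands in for vowel deletion/insertion."""
    count = lambda form: sum(ch in VOWELS for ch in form)
    return abs(count(underlying) - count(candidate))


def faith_c(underlying, candidate):
    """Crude FaithC: difference in consonant count."""
    count = lambda form: sum(ch not in VOWELS for ch in form)
    return abs(count(underlying) - count(candidate))


def no_complex(underlying, candidate):
    """Crude *Complex: number of adjacent consonant pairs (no syllabification)."""
    return sum(a not in VOWELS and b not in VOWELS
               for a, b in zip(candidate, candidate[1:]))


def evaluate(underlying, candidates, ranking):
    """Pick the candidate whose violation profile is best under the ranking."""
    def profile(candidate):
        return tuple(con(underlying, candidate) for con in ranking)
    return min(candidates, key=profile)


candidates = ["plan", "pan", "palan"]  # hand-listed stand-in for GEN
print(evaluate("plan", candidates, [no_complex, faith_c, faith_v]))
print(evaluate("plan", candidates, [faith_c, faith_v, no_complex]))
```

Reordering the same three constraints selects a different winner, which is how the ranking expresses that some constraints matter more than others.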
Optimality Theory (OT)
• Can OT be implemented by finite-state transducers?
• It is essential to enforce a constraint only if doing so does not reduce the set of possibilities to zero
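The condition in the last bullet can be sketched as a filter that is applied only when at least one candidate survives it. This is plain Python over lists, not a finite-state implementation; the constraints are simplified to yes/no predicates, and all names and candidate forms are assumptions made for illustration.

```python
VOWELS = set("aeiou")


def lenient_filter(candidates, constraint):
    """Keep the candidates that satisfy the constraint, unless none do."""
    survivors = [c for c in candidates if constraint(c)]
    return survivors if survivors else list(candidates)


def evaluate_leniently(candidates, ranked_constraints):
    """Apply each constraint in rank order, never emptying the candidate set."""
    for constraint in ranked_constraints:
        candidates = lenient_filter(candidates, constraint)
    return candidates


def no_cluster(form):
    """True if the form has no two adjacent consonants."""
    return not any(a not in VOWELS and b not in VOWELS
                   for a, b in zip(form, form[1:]))


def keeps_three_consonants(form):
    """True if the form still has the three consonants of the example input."""
    return sum(ch not in VOWELS for ch in form) == 3


def ends_in_vowel(form):
    return form[-1] in VOWELS


ranking = [no_cluster, keeps_three_consonants, ends_in_vowel]
print(evaluate_leniently(["plan", "pan", "palan"], ranking))
```

In this example the lowest-ranked constraint would eliminate every remaining candidate, so it is skipped and the set is passed through unchanged.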
Optimality Theory (OT)
• Ordinal OT grammars
– Tesar & Smolensky (1998)
– No absolute ranking values
  • i.e. they accepted only an ordinal relation between the constraint rankings
– Learning algorithm (Error-Driven Constraint Demotion, EDCD)
  • changes the ranking order whenever the form produced is different from the adult form
– Fast and convergent, but extremely sensitive to errors in the learning data
http://www.fon.hum.uva.nl/praat/manual/OT_learning_1__Kinds_of_OT_grammars.html
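A rough sketch of a single demotion step, in the spirit of error-driven constraint demotion but not a faithful reproduction of the published algorithm. Rankings are stored as stratum numbers (0 is highest), which gives only an ordinal relation. The function name, the violation counts, and the example forms in the comments are assumptions made for illustration.

```python
def demote(strata, adult_viol, learner_viol):
    """Return new strata after comparing the adult form with the learner's form.

    Constraints that penalize the adult form more than the learner's form are
    moved just below the highest-ranked constraint that penalizes the
    learner's form more.
    """
    favors_adult = [c for c in strata if learner_viol[c] > adult_viol[c]]
    favors_learner = [c for c in strata if adult_viol[c] > learner_viol[c]]
    if not favors_adult:
        return dict(strata)  # no reranking could make the adult form win
    top = min(strata[c] for c in favors_adult)
    updated = dict(strata)
    for c in favors_learner:
        if updated[c] <= top:
            updated[c] = top + 1  # demote to the stratum right below
    return updated


# All constraints start in the same stratum (hypothetical initial state).
strata = {"*Complex": 0, "FaithC": 0, "FaithV": 0}
# Hypothetical error: adult says "plan", learner produced "palan".
adult = {"*Complex": 1, "FaithC": 0, "FaithV": 0}
learner = {"*Complex": 0, "FaithC": 0, "FaithV": 1}

print(demote(strata, adult, learner))
```

Because every mismatch triggers a full reranking step, a single mislabeled adult form can demote the wrong constraint, which is one way to see the sensitivity to errors in the learning data.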
Optimality Theory (OT)
• Stochastic OT grammars
– Boersma (1997b) / Boersma (1998) / Boersma (2000)
– Every constraint has a ranking value along a continuous ranking scale
– A small amount of noise is added to this ranking value at evaluation time
– Associated error-driven learning algorithm (Gradual Learning Algorithm, GLA) effects small changes in the ranking values of the constraints with every learning step
– Can learn languages with optionality and variation
http://www.fon.hum.uva.nl/praat/manual/OT_learning_1__Kinds_of_OT_grammars.html
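The two ingredients of a stochastic OT grammar, noisy evaluation and small ranking updates, can be sketched as follows. This is an illustration, not Praat's implementation: the function names, the ranking values, the noise standard deviation, and the plasticity value are all assumptions.

```python
import random


def noisy_ranking(ranking_values, noise_sd, rng):
    """Order constraints by ranking value plus Gaussian noise, highest first."""
    noisy = {c: v + rng.gauss(0.0, noise_sd) for c, v in ranking_values.items()}
    return sorted(noisy, key=noisy.get, reverse=True)


def gla_update(ranking_values, adult_viol, learner_viol, plasticity):
    """Nudge ranking values after the learner's form differs from the adult form."""
    updated = dict(ranking_values)
    for c in updated:
        if adult_viol[c] > learner_viol[c]:
            updated[c] -= plasticity  # penalizes the adult form: lower it a little
        elif learner_viol[c] > adult_viol[c]:
            updated[c] += plasticity  # penalizes the learner's error: raise it a little
    return updated


# Hypothetical ranking values on a continuous scale.
values = {"*Complex": 100.0, "FaithV": 99.0, "FaithC": 90.0}
rng = random.Random(0)  # seeded so repeated runs give the same samples

# Close ranking values can swap order from one evaluation to the next.
for _ in range(3):
    print(noisy_ranking(values, noise_sd=2.0, rng=rng))

adult = {"*Complex": 1, "FaithV": 0, "FaithC": 0}
learner = {"*Complex": 0, "FaithV": 1, "FaithC": 0}
print(gla_update(values, adult, learner, plasticity=0.1))
```

Because constraints with nearby ranking values occasionally swap under noise, the same grammar can produce more than one output for one input, which is how optionality and variation are modeled.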
SIGMORPHON
• ACL Special Interest Group on Computational Morphology and Phonology (SIGMORPHON)
• Formerly known as the ACL Special Interest Group on Computational Phonology (SIGPHON)
• Recent research developments
• Matters of interest in computational phonology and morphology
http://salad.cs.swarthmore.edu/sigphon/systems.shtml