SSML Extensions for TTS in Indian Languages
II workshop on Internationalizing SSML, 30-31 May 2006, Greece
Nixon Patel and Kishore Prahallad
Bhrigus Inc. Hyderabad, India
IIIT Hyderabad, India
© Copyright 2006, Bhrigus Software Private Limited.
About Bhrigus
Collaborative Efforts between Bhrigus and IIIT Hyderabad
Nature of Indian language scripts – convergence and divergence
Issues across TTS rendering in all these languages
Proposed solutions/tags:
Syllable Element
Alien Element
Dialect Element
Topics
Bhrigus: voice & data solutions
http://www.bhrigus.com
Established: 2002
Business: Providing IVR, speech and enterprise solutions to BFSI, telcos, contact centers and manufacturing companies.
Key customers: Hewitt Associates, AT&T, Pfizer, Merrill Lynch, Union Pacific Railroad, CDIA, South Western Energy, Orange County, Stryker
SEI CMM Level 4 process implementation under way; ISO 9001:2000, KPMG certified.
About Bhrigus
Playing a leadership role in the development of ASR and TTS for all official Indian languages, to provide voice solutions for the Indian market
Collaborations: IIIT Hyderabad and Carnegie Mellon University
10-member team + board of advisors
3 PhDs and 4 Masters
Synthesis team, Recognition team, Linguist team and Language resources team
Initiating SSML and VXML chapters in India
Speech and Language Technology Lab @ Bhrigus
Bhrigus Inc. Hyderabad – Voice based solution providers
IIIT Hyderabad – one of the leading universities in India doing speech research
Telugu TTS – Collaborative Efforts between Bhrigus Inc. and IIIT
Goal: Develop ASR and TTS for all official Indian languages
Collaborative Efforts
Basic units of the writing system are Aksharas
An Akshara is an orthographic representation of a speech sound
Akshara is syllabic in nature, typical forms are V, CV, CCV and CCCV (C – consonant, V – vowel)
Always ends with a vowel (or nasalized vowel) in written form
~1652 dialects/native languages
22 languages officially recognized
Nature of Indian Language (IL) Scripts
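The typical Akshara forms listed on this slide (V, CV, CCV, CCCV) can be checked mechanically. The Python sketch below labels each phone as C or V and tests whether the resulting shape is one of those forms. The vowel set is an illustrative assumption, not the official Itrans-3/OM inventory, and the function names are hypothetical.

```python
# Illustrative vowel subset (an assumption); NOT the official Itrans-3/OM inventory.
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}

# The typical Akshara shapes named on the slide.
TYPICAL_SHAPES = {"V", "CV", "CCV", "CCCV"}


def akshara_shape(phones):
    """Return the C/V shape of a phone sequence, e.g. ['n', 'aa'] -> 'CV'."""
    return "".join("V" if p in VOWELS else "C" for p in phones)


def is_typical_akshara(phones):
    """True if the phone sequence has one of the typical Akshara shapes."""
    return akshara_shape(phones) in TYPICAL_SHAPES


if __name__ == "__main__":
    for unit in (["aa"], ["n", "aa"], ["s", "t", "r", "ii"], ["n", "k"]):
        print(unit, akshara_shape(unit), is_typical_akshara(unit))
```

A sequence such as "n k" has shape CC and is rejected, because a written Akshara always ends with a vowel.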
Aksharas are syllabic in nature
Common phonetic base
Share a common set of speech sounds across all languages
Fairly good (though not exact) correspondence between sequence of Aksharas and the corresponding sequence of sounds
Often referred to as Letter-to-sound rules
Written from left-to-right as in European languages
Words are separated by space as in European languages
Convergence of IL Scripts
Each IL has its own script
All ILs share a common phonetic base; however, the phonotactics of each IL differ from those of the others
ILs are non-tonal languages, unlike eastern languages such as Chinese
Divergence of IL Scripts
Unicode
Useful for *rendering* the Indian language scripts
Not suitable for keying in through a QWERTY keyboard
Not suitable for building modules such as text normalization (Unicode characters cannot be viewed in many editors)
Itrans-3 / OM - A transliteration scheme by IISc Bangalore, India and Carnegie Mellon University
Useful for *keying in and storing* the scripts of Indian languages using QWERTY keyboards
Useful for processing and writing modules/rules for letter-to-sound, text normalization etc.
How to represent Indian language Scripts
Itrans-3 / OM Notation
Designed for user readability: easier to read and type
It is case-insensitive.
The scheme is phonetic in nature: the characters correspond to the actual sounds being spoken.
Thus a single transliteration scheme is used for all the Indian languages, as they share the same set of sounds.
Each character (corresponding to a phone/sound) is no more than three letters long.
Adopted across universities in India and abroad, and by some industrial labs such as Bhrigus Inc.
Why Itrans-3/OM?
TTS should be able to pronounce words Akshara (syllable) by Akshara (syllable)
Languages are heavily influenced by English (alien) words
Alien words occur within sentences
Each language has its own dialects
Issues in TTS rendering in IL
<phoneme alphabet="itrans-3" ph="n aa t oo"> naatoo </phoneme>
The ph attribute specifies the phoneme/phone string
Rendering “n” “aa” “t” “oo” individually does not make sense to native speakers of Indian languages
Sounds need to be rendered in terms of syllables
SSML Tag: Phoneme Element <phoneme>
<syllable alphabet="itrans-3" syl="naa too"> naatoo </syllable>
Render “naa” and “too” which are Aksharas (syllables)
Syllable Element <syllable>
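As a rough illustration of how the phone string of a <phoneme> element relates to the proposed <syllable> element, the Python sketch below groups phones into consonant(s)+vowel units and rewrites the markup. The vowel set is an assumed illustrative subset, the helper names are hypothetical, and the grouping rule (close a unit at every vowel) is a simplification rather than the authors' syllabification method.

```python
import xml.etree.ElementTree as ET

# Illustrative vowel subset (an assumption); NOT the official Itrans-3/OM inventory.
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}


def group_into_aksharas(phones):
    """Group phones into C*V units; any trailing consonants form a final unit."""
    units, current = [], []
    for phone in phones:
        current.append(phone)
        if phone in VOWELS:  # a vowel closes the current Akshara
            units.append("".join(current))
            current = []
    if current:  # leftover consonants with no closing vowel
        units.append("".join(current))
    return units


def phoneme_to_syllable(phoneme_xml):
    """Rewrite a <phoneme ph="..."> element as a proposed <syllable syl="..."> element."""
    src = ET.fromstring(phoneme_xml)
    aksharas = group_into_aksharas(src.get("ph").split())
    syl = ET.Element("syllable", {"alphabet": src.get("alphabet"),
                                  "syl": " ".join(aksharas)})
    syl.text = src.text  # keep the written form unchanged
    return ET.tostring(syl, encoding="unicode")


if __name__ == "__main__":
    print(phoneme_to_syllable(
        '<phoneme alphabet="itrans-3" ph="n aa t oo"> naatoo </phoneme>'))
```

For the slide's example, the phones "n aa t oo" are grouped into the two Aksharas "naa" and "too".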
Informal experiments suggested that 33% of IL TTS errors occur while rendering alien (non-native) words
Such alien words could be detected automatically using the syllabic properties of the Indian languages
Motivation for Loan Word <alien>
BANK has to be pronounced as /B/ /AE/ /N/ /K/
The /AE/ phoneme does not exist in the Indian language phone set
<alien> baank </alien>
Alien (non-native) words could be rendered using different pronunciation dictionaries or letter-to-sound rules
Example of loan word
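One way the syllabic properties could be used to flag alien words is sketched below in Python. It is only a heuristic under stated assumptions: a word is treated as an alien candidate if its phone sequence does not end in a vowel, or if it contains a run of more than three consonants, since neither fits the V/CV/CCV/CCCV Akshara forms. The vowel set, the threshold and the function names are illustrative assumptions, nasalized-vowel endings are ignored, and this is not presented as the detection method used by the authors.

```python
# Illustrative vowel subset (an assumption); NOT the official Itrans-3/OM inventory.
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}
MAX_CONSONANTS = 3  # the longest typical Akshara form is CCCV


def is_alien_candidate(phones):
    """Heuristic: True if the phones cannot be split into typical Aksharas."""
    run = 0  # consonants seen since the last vowel
    for phone in phones:
        if phone in VOWELS:
            run = 0
        else:
            run += 1
            if run > MAX_CONSONANTS:
                return True
    return run > 0  # leftover consonants: the word does not end in a vowel


def mark_alien(words):
    """Wrap alien candidates in the proposed <alien> element.

    `words` is a list of (written_form, phones) pairs.
    """
    marked = []
    for text, phones in words:
        if is_alien_candidate(phones):
            marked.append(f"<alien> {text} </alien>")
        else:
            marked.append(text)
    return " ".join(marked)


if __name__ == "__main__":
    sentence = [("naatoo", ["n", "aa", "t", "oo"]),
                ("baank", ["b", "aa", "n", "k"])]
    print(mark_alien(sentence))
```

Here "baank" is flagged because it ends in the consonants "n k", while "naatoo" splits cleanly into "naa" and "too". A real system would likely combine such a check with pronunciation dictionaries.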
Each language has its own dialects
TTS should be able to handle dialects without unloading the language resources
Dialect Element <dialect>
<?xml version="1.0"?>
<speak version="1.0" xml:lang="tel-in">
  <voice gender="female">
    <dialect name="andhra"> yekkad'iki vel'laali </dialect>
    <dialect name="telengana" pro="yaad'iki poovaale"> yekkad'iki vel'laali </dialect>
  </voice>
</speak>
Dialect Element <dialect>
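To make the intended processing concrete, the Python sketch below parses the example document and lists, for each <dialect> element, the text a synthesizer might speak. It assumes that the `pro` attribute, when present, carries the dialect-specific pronunciation and otherwise the element text is spoken as written; that reading is inferred from the example rather than from a formal definition, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The slide's example, with plain ASCII quotes so that it parses as XML.
SSML = """<?xml version="1.0"?>
<speak version="1.0" xml:lang="tel-in">
  <voice gender="female">
    <dialect name="andhra"> yekkad'iki vel'laali </dialect>
    <dialect name="telengana" pro="yaad'iki poovaale"> yekkad'iki vel'laali </dialect>
  </voice>
</speak>"""


def rendering_plan(ssml):
    """Return (dialect name, text to speak) pairs for every <dialect> element."""
    root = ET.fromstring(ssml)
    plan = []
    for element in root.iter("dialect"):
        # Assumption: `pro` overrides the written text when it is present.
        spoken = element.get("pro") or (element.text or "").strip()
        plan.append((element.get("name"), spoken))
    return plan


if __name__ == "__main__":
    for name, spoken in rendering_plan(SSML):
        print(f"{name}: {spoken}")
```

Both elements sit in the same document and under the same voice, which matches the goal of switching dialects without unloading the language resources.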
Bhrigus Inc. Hyderabad is taking a lead position in developing ASR and TTS for Indian languages
Proposed <syllable>, <alien> and <dialect> elements as SSML extensions
Conclusions