Ways to generate computer speech
• Record a human speaking every sentence HAL will ever speak (not likely)
• Make a mathematical model of the human vocal tract (synthesis)
• Record a human speaking a lot of sentences, and come up with some way of making new sentences out of the recorded ones (concatenation)
What goes into synthesizing speech?
• Have some idea of what human speech actually looks/sounds like
  – Modeling the shape of a speaker’s mouth
  – Fricative noises and noises from stops
  – Pitch changes
• Produce sounds that resemble speech sounds
Synthesis: Putting it all together
• Shape of mouth: 1, 2, 3, all 3
• Fricative and burst noises
• Shape of mouth and fricative noises
• Shape of mouth, fricative noises, & pitch
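The three ingredients above (mouth shape, noise, pitch) can be sketched in a few lines of Python. This is only a rough illustration, not the method behind the examples on this slide: the mouth shape is approximated by a chain of simple resonators, pitch by a train of pulses, and fricative noise by random samples. The sample rate, the resonance values, and all function names are assumptions chosen for the sketch.

```python
import math
import random

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def pulse_train(pitch_hz, n_samples):
    """Pitch: one unit pulse per pitch period, zeros in between."""
    period = int(round(SAMPLE_RATE / pitch_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def noise(n_samples, seed=0):
    """Fricative/burst source: uniform random samples in [-1, 1]."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator(signal, freq_hz, bandwidth_hz):
    """Two-pole resonator: y[n] = x[n] + b1*y[n-1] + b2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    b1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE)
    b2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def shape(source, resonances):
    """Mouth shape: pass the source through each resonance, then normalize."""
    signal = list(source)
    for freq_hz, bandwidth_hz in resonances:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal] if peak > 0 else signal


# Illustrative (frequency, bandwidth) pairs in Hz; rough guesses, not measurements.
VOWEL_RESONANCES = [(700, 80), (1200, 90), (2600, 120)]

vowel = shape(pulse_train(120, 1600), VOWEL_RESONANCES)  # mouth shape + pitch
fricative = shape(noise(1600), [(5000, 1500)])           # mouth shape + noise
```

Changing the pitch argument of `pulse_train` over time would give the pitch changes mentioned earlier; writing `vowel` to a sound file is left out to keep the sketch short.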
Speech synthesis
• (1980): The Speak & Spell toy used a synthesis process called Linear Predictive Coding (LPC).
• Basically, LPC is a way for a computer to extract the different components of a speech signal (such as the shape of the mouth, noise, and pitch) and re-create them using a mathematical model of the vocal tract
• Here’s a better example of LPC (1982):
• LPC is used today for GSM phone systems
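To make the "extract" half of LPC more concrete, here is a minimal sketch of one common textbook way to estimate the vocal-tract model: the autocorrelation method with the Levinson-Durbin recursion. It is not the code used in the Speak & Spell or in GSM. The function names and the toy two-coefficient "vocal tract" are assumptions made up for the illustration.

```python
def autocorrelation(signal, max_lag):
    """r[lag] = sum over n of signal[n] * signal[n + lag]."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Find a[1..order] so that x[n] is predicted by sum_k a[k] * x[n-k].

    Returns (coefficients, remaining prediction-error energy).
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this step
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error


def lpc(frame, order):
    """Estimate predictor coefficients for one frame of samples."""
    return levinson_durbin(autocorrelation(frame, order), order)


def all_pole_filter(excitation, coeffs):
    """Re-create a signal: y[n] = e[n] + sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        y = e + sum(c * out[n - k]
                    for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(y)
    return out


# Toy "vocal tract" with two known coefficients (hypothetical values).
true_coeffs = [1.3, -0.6]
frame = all_pole_filter([1.0] + [0.0] * 255, true_coeffs)

# Analysis should recover coefficients close to the ones used above.
estimated, residual_energy = lpc(frame, order=2)
```

Feeding a pulse train or noise through `all_pole_filter` with the estimated coefficients is the "re-create" half of the process described above.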
Text-to-Speech (TTS) systems
• Concatenative synthesis
  – Record natural speech
  – Chop speech up into units
  – Recombine units according to the phonetic transcription to be pronounced
• Steps for a TTS system:
  – Start w/ written text
  – Convert text to phonetic characters
  – Find segments of speech in database
  – Calculate intonation of sentence
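The TTS steps listed above can be sketched as a toy pipeline. Everything here is a stand-in: the three-word lexicon, the ARPAbet-style phone labels, the fake "recordings" in the unit database, and the falling pitch line are all assumptions for illustration. A real system would use a large pronunciation dictionary, recorded speech units, and a far richer intonation model.

```python
import re

# Hypothetical hand-written lexicon: word -> phonetic characters.
LEXICON = {
    "the": ["DH", "AH"],
    "north": ["N", "AO", "R", "TH"],
    "wind": ["W", "IH", "N", "D"],
}

# Hypothetical unit database: each phone maps to a short list of samples.
# Real entries would be chopped out of recorded natural speech.
PHONES = sorted({p for phones in LEXICON.values() for p in phones})
UNIT_DB = {p: [0.01 * (i + 1)] * 4 for i, p in enumerate(PHONES)}


def text_to_phones(text):
    """Steps 1-2: split written text into words and look up each one."""
    phones = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phones.extend(LEXICON[word])
    return phones


def select_units(phones):
    """Step 3: find a stored speech segment for each phone."""
    return [UNIT_DB[p] for p in phones]


def pitch_contour(n_units, start_hz=130.0, end_hz=90.0):
    """Step 4 (placeholder): a simple falling pitch across the sentence."""
    if n_units == 1:
        return [start_hz]
    step = (end_hz - start_hz) / (n_units - 1)
    return [start_hz + i * step for i in range(n_units)]


def synthesize(text):
    """Run all steps and concatenate the selected units end to end."""
    phones = text_to_phones(text)
    units = select_units(phones)
    pitches = pitch_contour(len(units))
    samples = [s for unit in units for s in unit]
    return phones, pitches, samples


phones, pitches, samples = synthesize("The north wind")
```

The sketch only computes the pitch targets; actually bending each recorded unit toward its target pitch is the harder part and is not shown.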
Text-to-Speech (TTS) systems
Examples of text from The North Wind and the Sun (Aesop), circa 2005:
• Mike (AT&T)
• Crystal (AT&T)
• British English (Rhetorical Systems)
• Scottish English (Rhetorical Systems)