speech processing - sharifce.sharif.edu/courses/93-94/2/ce967-1/resources/root/lecture slides... ·...
Speech Processing
Speech Processing:
Review of DSP Concepts
Review of Probability and Stochastic Processes
Anatomy and Physiology of Speech Production System
Phonemics and Phonetics
Spectrogram Reading
Linear Prediction Analysis
Speech Coding and Compression
Speech Synthesis (Text to Speech)
Speech Quality Assessment (Subjective and Objective)
Speech Recognition (Speech to Text)
Speech Enhancement
Speech Processing:
Marking Scheme:
Homework: 10%
Projects: 15%
Quizzes: 20%
Midterm: 25%
Final Exam: 30%
Speech Processing:
Texts:
Spoken Language Processing, Huang, Acero, Hon, 2000
Introduction to Digital Speech Processing, Lawrence R. Rabiner and Ronald W. Schafer, 2007
Discrete-Time Processing of Speech Signals, Deller, Proakis, Hansen, 1993
Fundamentals of Speech Recognition, Rabiner, Juang, 1993
Password for any documents for the course:
40967spring93
Aristotle:
Man is a speaking (rational) animal.
Old Speech Synthesizers
– Speech organ of Wheatstone, based on a system proposed by Wolfgang von Kempelen in 1791
Old Speech Synthesizers (cont’d)
– Speech organ of Joseph Faber (1830-40)
Old Speech Synthesizers (cont’d)
– Voder demonstrated in 1939
Source: http://www.ling.su.se/staff/hartmut/kemplne.htm
More modern labs (ICP lab in Grenoble, France)
– Study of facial movements, to be incorporated into speech synthesis (and recognition).
Communication via Spoken Language
Virtues of Spoken Language
Natural: Requires no special training
Flexible: Leaves hands and eyes free
Efficient: Has high data rate
Economical: Communicated inexpensively
Expressive: Conveys more than just words
Popular/preferred: Verbal-acoustic problem solving
Much longer evolution, compared to written language
Virtues of Spoken Language
Speech interfaces are ideal for information access and management when:
The information space is broad and complex,
The users are not allowed, comfortable, or able to use their eyes to read text messages,
The users are technically naive, or
Only telephones are available.
Diverse Sources of Constraint for Spoken Language Communication
Acoustic: human vocal tract
Phonetic: "let us pray" vs. "lettuce spray"
Phonological: "gas shortage", "fish sandwich"
Phonotactic: "sprachst" (German)
Syntactic: "I am flying to Chicago tomorrow" vs. "tomorrow I flying Chicago am to"
Semantic: "Is the baby crying" vs. "Is the bay bee crying"
Contextual: "It is easy to recognize speech" vs. "It is easy to wreck a nice beach"
A Conversational System Architecture
Demo: Conversational Interface
Jupiter weather information system
Access through telephone
500 cities worldwide
Harvests weather information from the Web several times daily