Spoken Dialog Systems and Voice XML
Lecturer: Prof. Esther Levin
Description
Spoken dialogue systems enable users to interact with computer systems via natural and intelligent dialogues, as they would with human agents.
Development of such systems requires a wide range of speech and language technologies, including automatic speech recognition (ASR), to convert audio signals of human speech into text strings; natural language and dialogue processing (NLP), to determine the meanings and intentions of the recognized utterances and to generate a cooperative response to them; and text-to-speech synthesis (TTS), to convert the system utterance into actual speech output.
VoiceXML
VoiceXML is the HTML of the voice web, the open standard markup language for voice applications. While HTML assumes a graphical web browser with display, keyboard, and mouse, VoiceXML assumes a voice browser with audio output, audio input, and keypad input. Audio input is handled by the voice browser's speech recognizer. Audio output consists of both recordings and speech synthesized by the voice browser's text-to-speech system.
VoiceXML takes advantage of several trends: the growth of the World Wide Web and of its capabilities; improvements in computer-based speech recognition and text-to-speech synthesis; and the spread of the WWW beyond the desktop computer.
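To make the voice browser model above concrete, here is a minimal VoiceXML 2.0 sketch of a single question. It is an illustration only, not material from the course: the form id `order`, the field name `drink`, the option values, and the recording `welcome.wav` are all hypothetical. It shows the two kinds of audio output (a recording, with synthesized speech as the fallback) and the two kinds of input (speech and keypad).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Hypothetical example: every name and file below is an assumption. -->
  <form id="order">
    <field name="drink">
      <prompt>
        <!-- Audio output, case 1: play a recording. If welcome.wav cannot
             be fetched, the enclosed text is synthesized instead. -->
        <audio src="welcome.wav">Welcome.</audio>
        <!-- Audio output, case 2: plain text is rendered by text-to-speech. -->
        Would you like coffee or tea? You can also press 1 or 2.
      </prompt>

      <!-- Input: each option can be spoken (audio input, handled by the
           speech recognizer) or selected with a key press (keypad input). -->
      <option dtmf="1" value="coffee">coffee</option>
      <option dtmf="2" value="tea">tea</option>

      <!-- Simple recovery prompts for silence or unrecognized input. -->
      <noinput>I did not hear you. <reprompt/></noinput>
      <nomatch>Sorry, I did not understand. <reprompt/></nomatch>

      <!-- Runs once the field has a value. -->
      <filled>
        <prompt>You chose <value expr="drink"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

A voice browser that loads this document would speak the prompt, listen for "coffee" or "tea" or a key press, and then confirm the choice. How the document is served and which recognizer and synthesizer are used depend on the platform.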
Project Scope
We will be designing, developing, testing, and deploying spoken dialog systems for a variety of applications.
Upon successful completion of this project, a student should be able to:
understand the main functional components of a typical spoken language processing system;
have a detailed knowledge of the basic elements of spoken language technology, such as pattern recognition, Hidden Markov Models, and speech recognition;
have practical experience of speech recognition technologies and of spoken dialogue system development using VoiceXML;
appreciate current research issues in spoken language technology and be aware of its commercial applications.
Logistics
In this project-based course, students are grouped into teams to work on projects involving the design, implementation, and testing of spoken dialog systems.
The capstone course will last two semesters.
In the first semester, we will study key technologies involved in this multi-disciplinary field.
The second semester will focus on implementation of exciting real-world dialog systems using the VoiceXML platform.
There will be two kinds of lectures:
lectures that focus on technologies and theory (Pattern Recognition, Hidden Markov Models, Automated Speech Recognition, Spoken Dialog Systems, etc.);
lectures that focus on different aspects of VoiceXML.
For most of the semester we will alternate between the two kinds of lectures on a weekly basis.
The course material will be entirely self-contained.
Requirements and Grading
Fall 2006:
There will be 5-8 assignments.
Some of the assignments will be research to be presented in class.
Attendance is mandatory.
A passing grade in the ethics class is required to pass this course.
Spring 2007:
Group meeting
Project evaluation
Project report
Text
McTear, Michael. Spoken Dialogue Technology: Toward a Conversational User Interface. Springer-Verlag, 2004.
Webpage
http://www-cs.ccny.cuny.edu/~esther/capstone
Visit before every lecture for the latest announcements.
What is VoiceXML?
VoiceXML architecture
VoiceXML basics
Speech recognition
Text-to-speech