Topics in Artificial Intelligence:
Discourse and Dialogue
CS 359
Gina-Anne Levow
September 25, 2001
2
Course Information
Web page: http://www.classes/cs.uchicago.edu/classes/archive/2001/fall/CS359
Instructor: Gina-Anne Levow
Office Hours: TTH 1:30-2:30, RY 162C
3
Grading
• Discussion-oriented class
  – Email discussion topics before each session
• 10% Class participation
• 20% Homework exercises
• 20% Each article presentation (up to 2)
• 30-50% Term project
4
Spoken Language System: Data Flow
[Diagram: data flow through the system components]
• Signal Processing
• Speech Recognition
• Semantic Interpretation
• Discourse Interpretation
• Dialogue Management
• Response Generation
• Speech Synthesis
[Diagram label: Discourse & Dialogue]
5
Question-Answering System: Data Flow
[Diagram: data flow from Question and Document Collection to Answer]
• Document Retrieval
• Tokenization, Syntactic Analysis
• Semantic Analysis
• Answer Selection
• Semantic Analysis
• Question Type Analysis
• Syntactic Analysis
• Discourse Interpretation
6
Discourse & Dialogue: Overview
Discourse and dialogue
– Discourse interpretation
– Dialogue management
• Dialogue evaluation
7
Discourse & Dialogue Processing
• Discourse interpretation:
  – Correctly interpret meaning of utterance in context
    • Reference: Pronouns
    • Intention: Goal of utterance, Relationships among utterances
• Dialogue management:
  – Develop appropriate goals to respond to conversational partner
    • Finite-state, Template-based, Agent-based
  – Manage interaction
    • Turn-taking, Initiative, Openings, Politeness
8
Discourse Interpretation
• Goal: understand what the user really intends
• Example: Can you move it?
  – What does “it” refer to?
  – Is the utterance intended as a simple yes-no query or a request to perform an action?
• Issues addressed:
  – Reference resolution
  – Intention recognition
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
9
Reference Resolution
U: Where is A Bug’s Life playing in Summit?
S: A Bug’s Life is playing at the Summit theater.
U: When is it playing there?
S: It’s playing at 2pm, 5pm, and 8pm.
U: I’d like 1 adult and 2 children for the first show. How much would that cost?
• Knowledge sources:
  – Domain knowledge
  – Discourse knowledge
  – World knowledge
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
10
Reference Resolution: Global Focus/Task
• (From Grosz, “Typescripts of Task-oriented Dialogues”)
• E: Assemble the air compressor.
• .
• .
• … 30 minutes later…
• E: Plug it in / See if it works
• (From Grosz)
• E: Bolt the pump to the base plate
• A: What do I use?
• ….
• A: What is a ratchet wrench?
• E: Show me the table. The ratchet wrench is […]. Show it to me.
• A: It is bolted. What do I do now?
11
Reference Resolution
• Local structure: Recent, frequent mention
• Global structure: Task structure
  – Subdialogues for clarification
• Models: Focus stacks, Centering
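The focus-stack idea can be illustrated with a minimal sketch. It assumes a simplified reading of the model: each open task or subdialogue pushes a focus space onto a stack, a pronoun resolves to the most recently mentioned entity in the topmost non-empty space, and closing a subdialogue pops its space. The class and method names are hypothetical, and agreement checks (number, gender) are left out.

```python
class FocusStack:
    """Toy focus stack: one list of mentioned entities per open (sub)dialogue."""

    def __init__(self):
        # Each item is (segment_name, [entities in order of mention]).
        self._spaces = []

    def push(self, segment_name):
        """Open a new task segment or clarification subdialogue."""
        self._spaces.append((segment_name, []))

    def pop(self):
        """Close the current segment and return its name."""
        return self._spaces.pop()[0]

    def mention(self, entity):
        """Record an entity mentioned in the current segment."""
        self._spaces[-1][1].append(entity)

    def resolve_pronoun(self):
        """Search from the top of the stack down; the latest mention wins."""
        for _, entities in reversed(self._spaces):
            if entities:
                return entities[-1]
        return None


stack = FocusStack()
stack.push("bolt the pump to the base plate")
stack.mention("pump")
stack.push("clarification: ratchet wrench")
stack.mention("ratchet wrench")
in_subdialogue = stack.resolve_pronoun()     # inside the clarification
stack.pop()                                  # clarification is finished
after_subdialogue = stack.resolve_pronoun()  # back in the main task
```

In this toy run, "it" inside the clarification resolves to the wrench, and after the pop it resolves to the pump again, which mirrors the "It is bolted" turn in the task-oriented example.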
12
Relation Recognition: Intention
• A: You seem very quiet today; is there a problem?
• B: I have a headache.
• Answer
• A: Would you be interested in going to dinner tonight?
• B: I have a headache.
• Reject
13
Relation Recognition: Intention (Cont’d)
• Goals: Match utterance with one or more dialogue acts, capture information
• Sample dialogue acts:
  – Verbmobil
    • Greet/Thank/Bye
    • Suggest
    • Accept/Reject
    • Confirm
    • Clarify-Query/Answer
    • Give-Reason
    • Deliberate
14
Relation Recognition: Intention
• Knowledge sources:
  – Overall dialogue goals
  – Orthographic features, e.g.:
    • punctuation
    • cue words/phrases: “but”, “furthermore”, “so”
    • transcribed words: “would you please”, “I want to”
  – Dialogue history, i.e., previous dialogue act types
  – Dialogue structure, e.g.:
    • subdialogue boundaries
    • dialogue topic changes
  – Prosodic features of utterance: duration, pause, F0, speaking rate
• Empirical methods / Manual rule construction:
  – Probabilistic dialogue act classifiers: HMMs
  – Rule-based dialogue act recognition: CART, Transformation-based learning
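As a rough illustration of the manual rule construction route, the sketch below tags an utterance using three of the knowledge sources listed above: cue phrases, punctuation, and the previous dialogue act. The act labels are borrowed from the Verbmobil list, but the patterns and the function name `tag_dialogue_act` are illustrative assumptions, not the Verbmobil rules.

```python
import re

# Ordered (pattern, act) rules; the first match wins.
# The cue phrases are hand-picked for illustration only.
CUE_RULES = [
    (re.compile(r"^(hello|hi|good (morning|afternoon))\b"), "Greet"),
    (re.compile(r"\b(thanks|thank you)\b"), "Thank"),
    (re.compile(r"\b(bye|goodbye)\b"), "Bye"),
    (re.compile(r"\b(how about|what about|shall we|let's)\b"), "Suggest"),
    (re.compile(r"\bbecause\b"), "Give-Reason"),
]


def tag_dialogue_act(utterance, previous_act=None):
    """Return one dialogue act label for a transcribed utterance."""
    text = utterance.lower().strip()

    # 1. Cue words and phrases.
    for pattern, act in CUE_RULES:
        if pattern.search(text):
            return act

    # 2. Dialogue history: a reply to a suggestion is an accept or a reject.
    if previous_act == "Suggest":
        if re.match(r"^(yes|sure|okay|ok|fine)\b", text):
            return "Accept"
        if re.match(r"^(no|sorry)\b", text):
            return "Reject"

    # 3. Punctuation: any remaining question is treated as a clarification query.
    if text.endswith("?"):
        return "Clarify-Query"
    return "Unknown"
```

For example, `tag_dialogue_act("No, I am busy then.", previous_act="Suggest")` falls through the cue rules and is labeled from the dialogue history. A probabilistic classifier would replace the hand-ordered rules with scores learned from annotated dialogues.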
15
Intention Recognition: Example
U: What time is A Bug’s Life playing at the Summit theater?
• Using keyword extraction and vector-based similarity measures:
  – Intention: Ask-Reference: _time
  – Movie: A Bug’s Life
  – Theater: the Summit quadplex
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
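A minimal sketch of keyword extraction with a vector-based similarity measure follows. It covers only the intention label, not the movie and theater slots. Only the label `Ask-Reference: _time` comes from the slide; the other labels, the keyword lists, the stopword list, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Words ignored during keyword extraction (a small, hand-picked list).
STOPWORDS = {"the", "a", "is", "at", "in", "of", "to", "for", "do", "what"}

# Each intention is described by a few representative keywords (assumed).
INTENT_KEYWORDS = {
    "Ask-Reference: _time": "when time showtime start playing",
    "Ask-Reference: _location": "where theater location address",
    "Ask-Reference: _price": "how much cost price ticket",
}


def keywords(text):
    """Lowercase, strip punctuation, drop stopwords, count what is left."""
    tokens = [t.strip("?.,!").lower() for t in text.split()]
    return Counter(t for t in tokens if t and t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recognize_intent(utterance):
    """Pick the intention whose keyword vector is closest to the utterance."""
    query = keywords(utterance)
    return max(
        INTENT_KEYWORDS,
        key=lambda name: cosine(query, keywords(INTENT_KEYWORDS[name])),
    )
```

With these keyword lists, the utterance on the slide scores highest against the time intention because it shares "time" and "playing" with that vector. A deployed system would learn the vectors from labeled utterances rather than list them by hand.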
16
Relation Recognition: Information
• Goal: determine the informational relations between adjacent utterances or spans
• Examples:
  – Antz is not playing at the Maplewood theater; [Nucleus]
    • the theater’s under renovation. (evidence) [Satellite]
  – Would you like the suite? [Nucleus]
    • It’s the same price as the regular room. (motivation) [Satellite]
  – Can you get the groceries from the car? [Nucleus]
    • The key is on the dryer. (enablement) [Satellite]
17
Relation Recognition: Information
• Knowledge sources:
  – Domain knowledge base
  – User beliefs
  – User model: user characteristics, preferences, etc.
  – Dialogue history
• Information relation mechanisms:
  – Schemas: patterns of predicates
  – Rule-based recognition
  – Plan-based recognition:
    • Recipes: templates for performing actions
    • Planner: to construct plans for given goal
  – Case-based reasoning
• Empirical methods / Manual rule construction:
  – Probabilistic dialogue act classifiers: HMMs
  – Rule-based dialogue act recognition: CART, Transformation-based learning
18
Discourse & Dialogue: Overview
Discourse and dialogue
– Discourse interpretation
– Dialogue management
• Dialogue evaluation
19
Dialogue Management: Motivating Examples
• Dialogue 1:
S: Would you like movie showtime or theater playlist information?
U: Movie showtime.
S: What movie do you want showtime information about?
U: Saving Private Ryan.
S: At what theater do you want to see Saving Private Ryan?
U: Paramount theater.
S: Saving Private Ryan is not playing at the Paramount theater.
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
20
DM: Motivating Examples (Cont’d)
• Dialogue 2:
S: How may I help you?
U: When is Saving Private Ryan playing?
S: For what theater?
U: The Paramount theater.
S: Saving Private Ryan is not playing at the Paramount theater, but it’s playing at the Madison theater at 3:00, 5:30, 8:00, and 10:30.
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
21
DM: Motivating Examples (Cont’d)
• Dialogue 3:
S: How may I help you?
U: When is Saving Private Ryan playing?
S: For what theater?
U: The Paramount theater.
S: Saving Private Ryan is playing at the Fairmont theater at 6:00 and 8:30.
U: I wanted to know about the Paramount theater, not the Fairmont theater.
S: Saving Private Ryan is not playing at the Paramount theater, but it’s playing at the Madison theater at 3:00, 5:30, 8:00, and 10:30.
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
22
Comparison of Sample Dialogues
• Dialogue 1:
  – System-initiative
  – Implicit confirmation
  – Merely informs user of failed query
  – Mechanical
  – Least efficient
• Dialogue 2:
  – Mixed-initiative
  – No confirmation
  – Suggests alternative when query fails
  – More natural
  – Most efficient
• Dialogue 3:
  – Mixed-initiative
  – No confirmation
  – Suggests alternative when query fails
  – More natural
  – Moderately efficient
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
23
Dialogue Management
• Controls flow of dialogue
  – Openings, Closings, Politeness, Clarification, Initiative
  – Link interface to backend systems
• Mechanisms: increasing flexibility, complexity
  – Finite-state
  – Template-based
  – Agent-based
    • Plan inference
    • Theorem proving
    • Rational agency
• Acquisition
  – Hand-coding, probabilistic dialogue grammars, automata, HMMs
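The finite-state mechanism, the least flexible of those listed, can be sketched as a fixed table of states and transitions. The prompts below reuse the wording of Dialogue 1; the state names, slot names, and the `run_dialogue` function are hypothetical, and the backend lookup that would answer the query is omitted.

```python
# One prompt per state; "{movie}" is filled from slots collected so far.
PROMPTS = {
    "ask_service": "Would you like movie showtime or theater playlist information?",
    "ask_movie": "What movie do you want showtime information about?",
    "ask_theater": "At what theater do you want to see {movie}?",
}

# Fixed transitions: state -> (next state, slot filled by the user's answer).
TRANSITIONS = {
    "ask_service": ("ask_movie", "service"),
    "ask_movie": ("ask_theater", "movie"),
    "ask_theater": ("done", "theater"),
}


def run_dialogue(user_turns):
    """Walk the state machine, consuming one user turn per state."""
    state, slots, transcript = "ask_service", {}, []
    turns = iter(user_turns)
    while state != "done":
        # The system always speaks first: this is system-initiative.
        transcript.append("S: " + PROMPTS[state].format(**slots))
        answer = next(turns)  # raises StopIteration if the user stops early
        transcript.append("U: " + answer)
        next_state, slot = TRANSITIONS[state]
        slots[slot] = answer.rstrip(".")
        state = next_state
    return slots, transcript


slots, transcript = run_dialogue(
    ["Movie showtime.", "Saving Private Ryan.", "Paramount theater."]
)
```

Because the user can only answer the question just asked, the interaction is easy to build and to recognize but rigid. Template-based and agent-based managers relax exactly this constraint, at the cost of more complex control.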
24
Discourse & Dialogue: Overview
Discourse and dialogue
– Discourse interpretation
– Dialogue management
• Dialogue evaluation
25
Dialogue Evaluation
• System-initiative, explicit confirmation
  – better task success rate
  – lower WER
  – longer dialogues
  – fewer recovery subdialogues
  – less natural
• Mixed-initiative, no confirmation
  – lower task success rate
  – higher WER
  – shorter dialogues
  – more recovery subdialogues
  – more natural
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99
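WER (word error rate) in the comparison above is the standard speech recognition metric: the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. A minimal sketch using word-level edit distance follows; the function name is hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in recognizer output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For instance, recognizing "when is it playing there" as "when is it staying here" gives two substitutions over five reference words. Note that WER can exceed 1.0 when the recognizer inserts many extra words.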
26
Dialogue System Evaluation
• Black box:
  – Task accuracy with respect to solution key
  – Simple, but glosses over many features of interaction
• Glass box:
  – Component-level evaluation:
    • E.g. Word/Concept Accuracy, Task success, Turns-to-complete
  – More comprehensive, but Independence? Generalization?
• Performance function:
  – PARADISE [Walker et al.]:
    • Incorporates user satisfaction surveys, glass box metrics
    • Linear regression: relate user satisfaction, completion costs
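For reference, the PARADISE performance function is usually written as below. The symbols are not on the slide and follow the notation commonly used for the framework: κ is the kappa statistic measuring task success, the c_i are dialogue cost measures (for example number of turns or recognition errors), N is z-score normalization so that measures on different scales can be combined, and the weights α and w_i are the coefficients estimated by the linear regression against user satisfaction scores.

```latex
\[
\text{Performance} = \alpha \cdot \mathcal{N}(\kappa) \;-\; \sum_{i=1}^{n} w_i \cdot \mathcal{N}(c_i),
\qquad
\mathcal{N}(x) = \frac{x - \bar{x}}{\sigma_x}
\]
```

Read this way, task success raises the score and each cost lowers it, with the regression deciding how much each term matters to users.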
27
Publicly Available Telephone Demos
• Nuance http://www.nuance.com/demo/index.html
  – Banking: 1-650-847-7438
  – Travel Planning: 1-650-847-7427
  – Stock Quotes: 1-650-847-7423
• SpeechWorks http://www.speechworks.com/demos/demos.htm
  – Banking: 1-888-729-3366
  – Stock Trading: 1-800-786-2571
• MIT Spoken Language Systems Laboratory http://www.sls.lcs.mit.edu/sls/whatwedo/applications.html
  – Travel Plans (Pegasus): 1-877-648-8255
  – Weather (Jupiter): 1-888-573-8255
• IBM http://www.software.ibm.com/speech/overview/business/demo.html
  – Mutual Funds, Name Dialing: 1-877-VIA-VOICE
From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ’99