Grounding in Conversational Systems
Dan Bohus
January 2003
Dialogs on Dialogs Reading Group
Carnegie Mellon University
Overview
  Early grounding theories
    Discourse Contributions - Clark & Schaefer
    Conversational acts – Traum
  A Computational Framework (Horvitz, Paek)
    Principles
    Systems
  Grounding in RavenClaw
Clark & Schaefer
  In discourse, humans collaborate to establish/maintain mutual ground
  Discourse is structured in contributions
  Contribution: Presentation + Acceptance
  Grounding criterion:
    “A and B mutually believe that the partners have understood what A said to a criterion sufficient for the current purposes”
Clark & Schaefer (2)
  Evidence of understanding:
    Display
    Demonstration
    Acknowledgement
    Initiating the next relevant contribution
    Continued attention
  Display/Demonstration order challenged…
Clark & Schaefer (3)
  Infinite recursion avoided by the Strength of Evidence Principle
  4 possible states of non-understanding:
    L did not notice S’s utterance
    L notices it but does not hear it correctly
    L hears it correctly but does not understand it
    L understands it
Traum
  Conversational acts, an extension of speech acts theory
    Turn-taking
    Grounding
      Initiate, Continue, Cancel, ReqAck, Ack, ReqRepair, Repair
    Core speech acts
    Argumentational acts
  Eliminates infinite recursion: acks don’t need further acks
Traum (2)
  In later work, the following computational model is introduced:
    [Formula not recoverable from the transcript; surviving symbols: U, G, C]
  Finally, Brennan (& Clark) give another computational formulation and study the different types of grounding behaviors in different media
Criticisms
  These models are by and large descriptive.
    They can’t be used to determine the next best thing to do to achieve the grounding criterion.
    Moreover, they don’t quantitatively describe or make use of the uncertainty in contributions.
    They are insensitive to differences in channels, content, populations, etc.
    They cannot be used for guidance.
  Decision Theory to the rescue!
Decision Theory
  Action under uncertainty
  Given a set of states S = {s}, evidence e, and a set of actions A = {a}, where:
    P(s|e) is a probabilistic model of the state conditioned on the evidence
    U(a,s) is the utility of taking action a when in state s
  Take the action that maximizes the expected utility:
    EU(a|e) = Σ_s U(a,s) · P(s|e)
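A minimal sketch of the rule above, assuming the belief P(s|e) and the utilities U(a,s) are stored in plain dictionaries. The function names, state/action labels, and numbers are illustrative assumptions, not taken from the talk.

```python
def expected_utility(action, belief, utility):
    """EU(a|e): sum over states s of U(a, s) * P(s|e)."""
    return sum(utility[(action, state)] * p for state, p in belief.items())


def best_action(actions, belief, utility):
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, belief, utility))


# Toy example (hypothetical numbers): two states, two actions.
belief = {"s1": 0.7, "s2": 0.3}          # P(s|e)
utility = {                               # U(a, s)
    ("a1", "s1"): 10.0, ("a1", "s2"): -5.0,
    ("a2", "s1"): 2.0,  ("a2", "s2"): 4.0,
}
actions = ["a1", "a2"]

eu_a1 = expected_utility("a1", belief, utility)   # 0.7*10 + 0.3*(-5)
eu_a2 = expected_utility("a2", belief, utility)   # 0.7*2  + 0.3*4
chosen = best_action(actions, belief, utility)
```

With these numbers a1 is chosen, since its expected utility (5.5) exceeds that of a2 (2.6).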
Conversation under Uncertainty
  Conversation = action under uncertainty
  Example: I want to fly to Pittsburgh …
    States = {grounded, not_grounded}
      Inaccessible, but describable by a probabilistic model
      P(g | e) = P(Pittsburgh | e) … confidence annotation
    Actions = {explicit_confirm, implicit_confirm, continue_dialog}
    Utilities:
      U(ec,g) < U(ic,g) < U(cd,g)
      U(ec,ng) > U(ic,ng) > U(cd,ng)
I want to fly to Pittsburgh (2)
  States: NotGrounded (ng), Grounded (g)
  Actions: ExplicitConfirm (ec), ImplicitConfirm (ic), ContinueDialog (cd)
  Utilities:
    U(ec,g) < U(ic,g) < U(cd,g)
    U(ec,ng) > U(ic,ng) > U(cd,ng)
  [Figure labels: ng, g; ec, ic, cd; t1, t2]
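One way to read this example: because each action's expected utility is linear in P(g|e), the orderings above can produce two belief thresholds at which the best action switches. The sketch below assumes t1 and t2 denote those crossover points; the utility values are hypothetical and chosen only to respect the two orderings.

```python
# Hypothetical utilities that satisfy
#   U(ec,g) < U(ic,g) < U(cd,g)  and  U(ec,ng) > U(ic,ng) > U(cd,ng)
UTILITY = {
    "ec": {"g": 5.0, "ng": 4.0},
    "ic": {"g": 8.0, "ng": 2.0},
    "cd": {"g": 10.0, "ng": -6.0},
}


def expected_utility(action, p_grounded):
    """EU(a) = P(g|e) * U(a,g) + (1 - P(g|e)) * U(a,ng)."""
    u = UTILITY[action]
    return p_grounded * u["g"] + (1.0 - p_grounded) * u["ng"]


def crossover(a, b):
    """Belief P(g|e) at which actions a and b have equal expected utility."""
    ua, ub = UTILITY[a], UTILITY[b]
    slope_a = ua["g"] - ua["ng"]
    slope_b = ub["g"] - ub["ng"]
    return (ua["ng"] - ub["ng"]) / (slope_b - slope_a)


def best_action(p_grounded):
    """Action with the highest expected utility at the given belief."""
    return max(UTILITY, key=lambda a: expected_utility(a, p_grounded))


t1 = crossover("ec", "ic")   # below t1: explicit confirmation
t2 = crossover("ic", "cd")   # above t2: continue the dialog
```

With these values t1 = 0.4 and t2 = 0.8, so a confidence of 0.6 would lead to an implicit confirmation.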
Overview
  Early grounding theories
    Discourse Contributions - Clark & Schaefer
    Conversational acts – Traum
  A Computational Framework (Horvitz, Paek)
    Principles
    Systems
      DeepListener
      Bayesian Receptionist (Quartet architecture)
      Presenter (Quartet architecture)
  Grounding in RavenClaw
DeepListener - Domain
  Domain
    Provides spoken command-and-control functionality for LookOut
    Responds to offers of assistance (Yes/No)
  Small domain, but illustrates the core ideas very well
DeepListener - States
  States: 5 possible “intentions” of the user
    Acknowledgement
    Negation
    Reflection
    Unrecognized Signal
    No Signal
  State model P(S|E): temporal Bayesian network
    E = User’s Actions, Content, ASR Results and Reliability (+ at time t-1)
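As a rough illustration of how a temporal model carries belief from one turn to the next, here is a single predict-and-update step of a discrete Bayesian filter. This is an HMM-style simplification, not DeepListener's actual network: the two states, the transition table, and the likelihood values are all hypothetical.

```python
def filter_step(prior, transition, likelihood):
    """One temporal update of a discrete belief.

    prior:      P(state at t-1 | evidence so far)
    transition: transition[r][s] = P(state s at t | state r at t-1)
    likelihood: likelihood[s] = P(evidence at t | state s)
    """
    states = list(prior)
    # Predict: push the previous belief through the transition model.
    predicted = {s: sum(prior[r] * transition[r][s] for r in states) for s in states}
    # Update: weight by how well each state explains the new evidence.
    unnormalized = {s: predicted[s] * likelihood[s] for s in states}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every state")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical two-state example (the real system distinguishes five intentions).
prior = {"ack": 0.5, "neg": 0.5}
transition = {
    "ack": {"ack": 0.8, "neg": 0.2},
    "neg": {"ack": 0.2, "neg": 0.8},
}
likelihood = {"ack": 0.9, "neg": 0.1}   # e.g. the recognizer heard something like "yes"

posterior = filter_step(prior, transition, likelihood)
```

The resulting posterior is what a decision-theoretic action selector would consume as P(S|E).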
DeepListener - Actions
  Actions:
    Execute the service
    Repeat
    Note a hesitation and try again
    Was that meant for me?
    Try to get the user’s attention
    Apologize for the interruption and forego the service
    Troubleshoot the overall dialog
DeepListener - Utilities
  Utilities
    Elicited through psychological experiments
    Elicited through slidebars
    Works when you have 2 or 3 grounding actions and a clear, small state-space design, but what about when the problem gets more complex?
  Example (paper)
Bayesian Receptionist, Presenter
  Bayesian Receptionist: performs the tasks of a receptionist at a MS front desk
    “I’m here to see Rashid”
    “Bathroom?”
    “Beam me to 25 please”
    … 32 goals
  Presenter: command & control interface to PowerPoint presentations
  Both based on the Quartet architecture
Quartet
  Uses decision theory (DT) and Bayesian networks (BN) to ensure grounding at 4 different levels:
    Signal
    Channel
    Intention
    Conversation
  The actual dialog management (DM) task is encapsulated in the same framework at the Intention level
    Different domains = different intention levels
Quartet – Signal & Channel
  At each level, infer a distribution over possible states. Key variables:
    Signal level: signal identified (low/med/hi)
    Channel level: user’s focus of attention
  Maintenance module integrates Signal & Channel levels -> Maintenance Status:
    Channel x Signal: NoChannel, NoSignal, ChannelButNoSignal, SignalButNoChannel, Signal
Quartet – Intention Level
  Domain is mostly goal inference
  Hierarchical decomposition on levels, where lower levels refine the goals into more specific needs
  Use BN to model p(goal | e) at each level
    Psychological studies to identify key variables and utilities
      Visual cues
      Linguistic variables, both syntactic and semantic
Quartet – Intention Level
  To move between levels, compare probability of goal to…
    p-progress
      (above: do it)
    p-guess
      (above: seek confirmation)
      (below: seek more info via VOI)
    p-backtrack: used on return nodes
  Use Value-of-Information (VOI) analysis to infer which variable should be queried next.
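The threshold comparison can be sketched as a small decision function. This is only an illustration of the rule as summarized on the slide: the function name, the returned labels, and the threshold values are assumptions, and p-backtrack is left out because the slide only says it applies on return nodes.

```python
def next_move(p_goal, p_progress, p_guess):
    """Pick the next move at one level of the goal hierarchy.

    p_goal:     probability of the most likely goal at this level
    p_progress: threshold above which the system simply acts on the goal
    p_guess:    threshold above which the system asks for confirmation
    """
    if not 0.0 <= p_guess <= p_progress <= 1.0:
        raise ValueError("expected 0 <= p_guess <= p_progress <= 1")
    if p_goal > p_progress:
        return "progress"        # confident enough: do it
    if p_goal > p_guess:
        return "confirm"         # plausible: seek confirmation
    return "gather_info"         # too uncertain: query another variable (via VOI)


# Hypothetical thresholds for illustration.
P_PROGRESS, P_GUESS = 0.9, 0.5
moves = [next_move(p, P_PROGRESS, P_GUESS) for p in (0.95, 0.7, 0.3)]
```

In the "gather_info" branch, a Value-of-Information analysis would decide which variable to ask about; that step is not modeled here.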
Comments on Intention level
  What is the size of the learning problem? (How many BNs are needed?)
  How much data is needed for training?
  Not very clear:
    how to deal with attribute/values with rich ranges (e.g. which bus station?)
    how to deal with richer dialog mechanisms (beyond C&C applications)
      focus shifts, mixed initiative
      providing help
Quartet – Conversation Level
  See image.
  Use Intention and Maintenance Status to infer:
    Grounding: diagnoses mutual understanding
      Okay, ChannelFailure, IntentionFailure, ConversationFailure
    Activity goal: measures whether the user is engaged in an activity with the system
  Compute expected utility for each action (utilities elicited through psychological studies)
Bayesian Receptionist, Presenter
  Runtime behavior (section 3)
  Presenter
    The Signal & Channel levels allow continuous listening to be treated uniformly within the same framework
    Experiments show that it’s better than random, but significantly worse than humans
    But then again, the experiments were not very fair, being performed only at that level (i.e. no engaging in dialog allowed)
My Research …
  Deal with misunderstandings
  Use probabilistic modeling and decision theory to make grounding decisions (but not task decisions)
    I want a room tomorrow morning (0.73)
    States: time correctly understood/not
    Grounding Actions: no_action, expl_conf, impl_conf, reject
    Utilities: try to learn them by relating the actions to an overall dialog/grounding metric
RavenClaw: Dialog Task / Grounding
  [Diagram labels, Dialog Task: RoomLine, Login, RoomLine, GetQuery, ExecuteQuery, DiscussResults, Bye]
  [Diagram labels, Grounding Level: Grounding Model; Strategies/Grounding Actions; Optimal action; State / how well are things going]
States and Actions
  Actions
    Strategies.xls
  States (have to keep it small!!!)
    Single “state-space” model
      What are the variables? Which are observable and which are stochastically modeled?
    Multiple “state-space” models
      First 5 strategies: state = amount of grounding on each concept
      What should state be for the rest? What are the indicators? Which are fully observable and which are not?
      How to combine decisions from different spaces
Utilities
  Learn them! How?
    Idea 1: POMDPs; maybe at this small size they are tractable
    Idea 2: Regression to some overall dialog metric
      What should that be? (hmm) amount of non-null grounding actions taken …
…