TrindiKit: A Toolkit for Flexible Dialogue Systems

Staffan Larsson Kyoto, Japan

2003

What is TrindiKit?

• a toolkit for
– building and experimenting with dialogue move engines and systems,
– based on the information state approach

• not a dialogue system in itself

This lecture

• architecture & concepts
• what’s in TrindiKit?
• building a system
• extending TrindiKit
• features and advantages of TrindiKit
• a sample system: GoDiS

Architecture & concepts

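[Figure: TrindiKit architecture. Modules (module1 … modulen, some of them forming the DME) and resources (resource1 … resourcem) are connected to the Total Information State (TIS), which consists of the Information state proper (IS), Module Interface Variables and Resource Interface Variables. A control component coordinates the modules.]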

Information State (IS)

• an abstract data structure (record, DRS, set, stack etc.)

• accessed by modules using conditions and operations

• the Total Information State (TIS) includes
– Information State proper (IS)
– Module Interface variables
– Resource Interface variables
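The three parts of the TIS can be pictured with a minimal Python sketch. It is an illustration only, not TrindiKit code: the class names, the field names (`private_agenda`, `shared_qud`, `shared_com`) and the two methods are assumptions chosen to mirror the slide.

```python
from dataclasses import dataclass, field


@dataclass
class InformationState:
    """IS proper (hypothetical layout with private and shared parts)."""
    private_agenda: list = field(default_factory=list)  # used as a stack
    shared_qud: list = field(default_factory=list)      # questions under discussion
    shared_com: set = field(default_factory=set)        # shared commitments


@dataclass
class TotalInformationState:
    """TIS = IS proper + module interface vars + resource interface vars."""
    is_proper: InformationState = field(default_factory=InformationState)
    module_vars: dict = field(default_factory=dict)     # e.g. input, latest_moves
    resource_vars: dict = field(default_factory=dict)   # e.g. lexicon, domain

    def qud_is_empty(self) -> bool:
        """A condition: inspects the state without changing it."""
        return not self.is_proper.shared_qud

    def push_qud(self, question: str) -> None:
        """An operation: changes the state."""
        self.is_proper.shared_qud.append(question)


tis = TotalInformationState()
tis.push_qud("?channel")
```

Modules would only ever touch the state through conditions and operations like these two, which is what keeps the data structure abstract.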

Dialogue Move Engine (DME)

• module or group of modules responsible for
– updating the IS based on observed moves
– selecting moves to be performed

• dialogue moves are associated with IS updates using IS update rules
– there are also update rules not directly associated with any move (e.g. for reasoning and planning)

• update rules: rules for updating the TIS
– rule name and class
– precondition list: conditions on TIS
– effect list: operations on TIS

• update rules are coordinated by update algorithms
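As a rough illustration of the rule anatomy listed above (name, class, precondition list, effect list), here is a minimal Python sketch. The `UpdateRule` class, the `apply_first` function and the `integrateGreet` rule are hypothetical and only mimic the idea; TrindiKit itself expresses rules in its own Prolog-based language.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class UpdateRule:
    name: str
    rule_class: str
    preconditions: List[Callable[[dict], bool]]  # conditions on the state
    effects: List[Callable[[dict], None]]        # operations on the state

    def apply(self, state: dict) -> bool:
        """Run the effects only if every precondition holds."""
        if all(cond(state) for cond in self.preconditions):
            for effect in self.effects:
                effect(state)
            return True
        return False


def apply_first(rules, state):
    """A trivial update algorithm: apply the first rule that fires."""
    for rule in rules:
        if rule.apply(state):
            return rule.name
    return None


# Hypothetical rule: if a greeting was observed, plan to greet back.
integrate_greet = UpdateRule(
    name="integrateGreet",
    rule_class="integrate",
    preconditions=[lambda s: "greet" in s["latest_moves"]],
    effects=[lambda s: s["agenda"].append("greet")],
)

state = {"latest_moves": ["greet"], "agenda": []}
fired = apply_first([integrate_greet], state)
```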

Modules and resources

• Modules (dialogue move engine, input, interpretation, generation, output etc.)
– access the information state
– no direct communication between modules
  • only via module interface variables in TIS
  • modules don’t have to know anything about other modules
  • increases modularity, reusability, reconfigurability
– may interact with user or external processes

• Resources (device interface, lexicons, domain knowledge etc.)
– hooked up to the information state (TIS)
– accessed by modules
– defined as object of some type (e.g. ”lexicon”)
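The "no direct communication" principle can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the TIS is a plain dict, and the variable names `input`, `latest_moves`, `next_moves`, `output` and `lexicon` are modelled on names that appear elsewhere in these slides.

```python
def interpret(tis):
    """Interpretation module: reads `input`, writes `latest_moves`."""
    tis["latest_moves"] = ["greet"] if tis["input"] == "hello" else ["unknown"]


def select(tis):
    """DME-like module: reads `latest_moves`, writes `next_moves`."""
    greeted = "greet" in tis["latest_moves"]
    tis["next_moves"] = ["greet"] if greeted else ["ask_repeat"]


def generate(tis):
    """Generation module: reads `next_moves` and the lexicon resource."""
    lexicon = tis["lexicon"]  # resource interface variable
    tis["output"] = " ".join(lexicon[move] for move in tis["next_moves"])


tis = {
    "input": "hello",
    "lexicon": {"greet": "Hello!", "ask_repeat": "Pardon?"},
}

# No module calls another; each one only reads and writes TIS variables.
for module in (interpret, select, generate):
    module(tis)
```

Because each function depends only on variable names, any one of them could be swapped for a different implementation without touching the others.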

What’s in TrindiKit?

What does TrindiKit provide?

• High-level formalism and interpreter for implementing dialogue systems
– promotes transparency, reusability, plug-and-play, etc.
– allows implementation and comparison of dialogue theories
– hides low-level software engineering issues

• GUI, WWW-demo

• Ready-made modules and resources
– speech
– interfaces to databases, devices, etc.
– reasoning, planning

TrindiKit contents (1)

• a library of datatype definitions (records, DRSs, sets, stacks etc.)
– user-extensible

• a language for writing information state update rules

• GUI: methods and tools for visualising the information state

• debugging facilities
– typechecking
– logs of communication between modules and the TIS
– etc.

TrindiKit contents (2)

• A language for defining update algorithms used by TrindiKit modules to coordinate update rule application

• A language for defining basic control structure, to coordinate modules

• A library of basic ready-made modules for input/output, interpretation, generation etc.

• A library of ready-made resources and resource interfaces, e.g. to hook up databases, domain knowledge, devices etc.

Special modules and resources included with TrindiKit

• OAA interface resource
– enables interaction with existing software and languages other than Prolog

• Speech recognition and synthesis modules
– TrindiKit shells for off-the-shelf products, e.g. Nuance

• Possible future modules:
– planning and reasoning modules
– multimodal input and output

Asynchronous TrindiKit

• Internal communication uses either
– OAA (Open Agent Architecture) from SRI, or
– AE (Agent Environment), a stripped-down version of OAA, implemented for TrindiKit

• enables asynchronous dialogue management
– e.g.: system can listen and interpret, plan the dialogue, and talk at the same time

How to build a system

How to use TrindiKit

[Figure: the information state approach, implemented by TrindiKit.]

• We start from TrindiKit
– Implements the information state approach
– Takes care of low-level programming: dataflow, data structures etc.

How to build a basic system

[Figure: layers labelled "information state approach" and "TrindiKit", then "basic dialogue theory" and "basic system".]

• Formulate a basic dialogue theory
– Information state
– Dialogue moves
– Update rules

• Add appropriate modules (speech recognition etc.)

How to build a genre-specific system

[Figure: the previous layers ("information state approach", "TrindiKit", "basic dialogue theory", "basic system") extended with "genre-specific theory additions" and "genre-specific system".]

• Add genre-dependent IS components, moves and rules

How to build an application

[Figure: the previous layers ("information state approach", "TrindiKit", "basic dialogue theory", "basic system", "genre-specific theory additions", "genre-specific system") extended with "domain & language resources" and "application".]

• Add application-specific resources

Building a domain-independent Dialogue Move Engine

• Come up with a nice theory of dialogue

• Formalise the theory, i.e. decide on
– Type of information state (DRS, record, set of propositions, frame, ...)
– A set of dialogue moves
– Information state update rules, including rules for integrating and selecting moves
– DME Module algorithm(s) and basic control algorithm
– any extra datatypes (e.g. for semantics: proposition, question, etc.)

Specifying Infostate type

• the Total Information State contains a number of Information State Variables
– IS, the Information State ”proper”
– Interface Variables
  • used for communication between modules
– Resource Variables
  • used for hooking up resources to the TIS, thus making them accessible to modules

• use prespecified or new datatypes

Specifying a set of moves

• amounts to specifying objects of type move (a reserved type)
– there may be type constraints on the arguments of moves

• Example: GoDiS dialogue moves
– Ask(Q), Q is a question
– Answer(A), A is an answer (proposition or fragment)
– Request(α), α is an action
– Confirm(α)
– Greet
– Quit

Writing rules

• rule = conditions + updates
– if the rule is applied to the IS and its conditions are true, the operations will be applied to the IS
– conditions may bind variables with scope over the rule (Prolog variables, with unification and backtracking)

Example: a rule from GoDiS

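rule( integrateUsrAnswer,
      % preconditions: conditions on the TIS
      [ $/shared/lu/speaker = usr,                     % latest utterance is the user's
        assoc( $/shared/lu/moves, answer(R), false ),  % an answer move not yet integrated
        fst( $/shared/qud, Q ),                        % Q is topmost on QUD
        $domain : relevant_answer( Q, R ),             % R is a relevant answer to Q
        $domain : reduce( Q, R, P )                    % Q and R reduce to proposition P
      ],
      % effects: operations on the TIS
      [ set_assoc( /shared/lu/moves, answer(R), true ),
        /shared/qud := $$pop( $/shared/qud ),
        add( /shared/com, P )
      ] ).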
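For readers who do not know Prolog, the rule can be paraphrased in Python. This is a loose sketch, not a translation of TrindiKit semantics: the TIS is a nested dict, moves are tuples, there is no backtracking beyond a simple loop, and `ToyDomain` is a hypothetical stand-in for the `$domain` resource with made-up behaviour.

```python
class ToyDomain:
    """Hypothetical stand-in for the $domain resource."""

    def relevant_answer(self, question, answer):
        # Toy convention: "channel(5)" is relevant to the question "?channel".
        return answer.startswith(question.lstrip("?") + "(")

    def reduce(self, question, answer):
        # Toy reduction: the answer already is the resulting proposition.
        return answer


def integrate_usr_answer(tis):
    """Return True if the rule fired (preconditions held, effects applied)."""
    shared = tis["shared"]
    lu = shared["lu"]
    if lu["speaker"] != "usr" or not shared["qud"]:
        return False
    q = shared["qud"][-1]                      # fst: topmost question on QUD
    for move, integrated in lu["moves"].items():
        if integrated or move[0] != "answer":  # assoc(..., answer(R), false)
            continue
        r = move[1]
        if tis["domain"].relevant_answer(q, r):
            p = tis["domain"].reduce(q, r)
            lu["moves"][move] = True           # set_assoc(..., true)
            shared["qud"].pop()                # pop QUD
            shared["com"].add(p)               # add P to shared commitments
            return True
    return False


tis = {
    "domain": ToyDomain(),
    "shared": {
        "lu": {"speaker": "usr", "moves": {("answer", "channel(5)"): False}},
        "qud": ["?channel"],
        "com": set(),
    },
}
applied = integrate_usr_answer(tis)
```

Calling the function a second time returns `False`, because the answer move is now marked as integrated and QUD is empty.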

Building modules

• Algorithm
– For DME modules: coordinate update rules
– For control modules: coordinate other modules

• TrindiKit includes a language for writing algorithms
– For DME modules: basic imperative programming constructs
– For control module: basic imperative constructs plus asynchronous triggers

Sample update algorithm

grounding,
if $latest_speaker == sys then
    try integrate,
    try database,
    repeat downdate_agenda,
    store
else
    repeat
        integrate
        orelse accommodate
        orelse find_plan
        orelse if empty( $/private/agenda ) then manage_plan
               else downdate_agenda
    repeat downdate_agenda
    if empty( $/private/agenda ) then repeat manage_plan
    repeat refill_agenda
    repeat store_nim
    try downdate_qud
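The constructs used in the algorithm (`try`, `repeat`, `orelse`) can be approximated in Python as follows. This is a sketch under assumptions: a rule class is modelled as a function that returns `True` when one of its rules fired, the `limit` guard is an addition of this example, and the three toy rule classes are hypothetical, not the GoDiS ones.

```python
def try_(rule_class, state):
    """Apply the rule class once if possible; never fails."""
    rule_class(state)
    return True


def repeat(rule_class, state, limit=100):
    """Apply until nothing fires; returns how many times it fired."""
    count = 0
    while count < limit and rule_class(state):
        count += 1
    return count


def orelse(*alternatives):
    """Try alternatives left to right, stopping at the first that fires."""
    def combined(state):
        return any(alt(state) for alt in alternatives)
    return combined


# Toy rule classes (hypothetical).
def integrate(state):
    if state["latest_moves"]:
        state["done"].add(state["latest_moves"].pop(0))
        return True
    return False


def accommodate(state):
    return False  # never applicable in this toy example


def downdate_agenda(state):
    if state["agenda"] and state["agenda"][-1] in state["done"]:
        state["agenda"].pop()
        return True
    return False


state = {"latest_moves": ["answer", "ask"], "done": set(),
         "agenda": ["ask", "answer"]}
n_integrated = repeat(orelse(integrate, accommodate), state)
n_downdated = repeat(downdate_agenda, state)
```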

Sample control algorithm (2)

input: {
    init => input:display_prompt,
    new_data(user_input) => input
}
| interpretation: {
    import interpret,
    condition(is_set(input)) => [ interpret, print_state ]
}
| dme: {
    import update,
    import select,
    init => [ select ],
    condition(not empty(latest_moves)) => [
        update,
        if $latest_speaker == usr then select
    ]
}
| generation: {
    condition(is_set(next_moves)) => generate
}
| output: {
    condition(is_set(output)) => output
} )).
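The `condition(...) => action` triggers can be imitated with a small loop in Python. Note the simplifications, all of them assumptions of this sketch: it runs synchronously in a single thread (the real control language supports asynchronous triggers), the TIS is a dict, and each toy action consumes the variable that triggered it so that it does not fire twice.

```python
def run_control(tis, triggers, max_steps=20):
    """Fire the first trigger whose condition holds; stop when none does."""
    trace = []
    for _ in range(max_steps):
        for name, condition, action in triggers:
            if condition(tis):
                action(tis)
                trace.append(name)
                break
        else:
            break  # no trigger fired: the system is idle
    return trace


def interpret(tis):
    tis["latest_moves"] = [tis.pop("input")]


def update_and_select(tis):
    tis["next_moves"] = ["echo:" + move for move in tis.pop("latest_moves")]


def generate(tis):
    tis["output"] = ", ".join(tis.pop("next_moves"))


triggers = [
    ("interpretation", lambda t: "input" in t, interpret),
    ("dme", lambda t: bool(t.get("latest_moves")), update_and_select),
    ("generation", lambda t: "next_moves" in t, generate),
]

tis = {"input": "hello"}
trace = run_control(tis, triggers)
```

The order in which modules run is never written down; it falls out of which TIS variables are set at each step.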

From DME to dialogue system

Build or select from existing components:

• Modules, e.g.
– input
– interpretation
– generation
– output

• Still domain independent

• the choice of modules determines e.g. the format of the grammar and lexicon

Domain-specific system

Build or select from existing components:

• Resources, e.g.
– domain (device/database) interface
– dialogue-related domain knowledge, e.g. plan libraries etc.
– grammars, lexicons

• Example resources: GoDiS VCR control
– VCR interface
– Domain knowledge
– Lexicon

Extending TrindiKit

You can add

• Datatypes
– Whatever you need

• Modules
– e.g. general interfaces to speech recognizers and synthesizers

• Resources
– e.g. general interfaces to (passive) devices

• It is important that everything added is reasonably general, so that it can be reused in other systems

Datatype definitions

• relations
– relations between objects; true or false

• functions
– functions from arguments to result

• selectors
– selects an object embedded in another object

• operations
– changes the information state
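The four kinds of definitions can be illustrated with a stack datatype in Python. This is only an analogy: the class and method names are assumptions of this sketch (`fst` is borrowed from the GoDiS rule shown earlier), and TrindiKit datatypes are defined in Prolog, not as classes.

```python
class Stack:
    """Illustrative datatype with one definition of each kind."""

    def __init__(self, items=()):
        self._items = list(items)

    # Relations: true or false of the object(s).
    def is_empty(self):
        return not self._items

    def fst(self, x):
        return bool(self._items) and self._items[-1] == x

    # Function: computes a result from its argument, changing nothing.
    def popped(self):
        return Stack(self._items[:-1])

    # Selector: picks out an object embedded in this one.
    def top(self):
        return self._items[-1]

    # Operation: changes the state.
    def push(self, x):
        self._items.append(x)


qud = Stack(["?date"])
qud.push("?channel")
rest = qud.popped()
```

After these calls `qud` still holds both questions, while `rest` is a new stack holding only `?date`; only `push` changed anything.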

Building modules

• DME modules
– Specific to a certain theory of dialogue management
– Best implemented using rules and algorithms

• Other modules
– Should be more general, less specific to a certain theory of dialogue management
– May be easier to implement directly in Prolog or another language
  • TrindiKit algorithm language currently only covers checking and updating the infostate
  • These modules may also need to interact with other programs or devices

Building resources

• Resource
– the resource itself; exports a set of predicates

• Resource interface
– defines the resource as a datatype T, i.e. in terms of relations, functions and operations

• Resource interface variable
– a TIS variable whose value is an object of the type T

• By changing the value of the variable, resources can be switched dynamically
– change language
– change domain
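Dynamic switching through a resource interface variable can be sketched as below. The `Lexicon` class, its `realise` method and the move name `acknowledge` are hypothetical; the two surface strings are taken from the multilingual GoDiS dialogue later in these slides.

```python
class Lexicon:
    """Hypothetical resource type: maps dialogue moves to surface strings."""

    def __init__(self, language, phrases):
        self.language = language
        self._phrases = phrases

    def realise(self, move):
        return self._phrases[move]


ENGLISH = Lexicon("english", {"acknowledge": "Okay."})
SWEDISH = Lexicon("svenska", {"acknowledge": "Okej."})


def generate(tis, move):
    """A module sees only the variable, never a particular lexicon."""
    return tis["lexicon"].realise(move)


tis = {"lexicon": ENGLISH}      # resource interface variable
before = generate(tis, "acknowledge")

tis["lexicon"] = SWEDISH        # switch language by reassigning the variable
after = generate(tis, "acknowledge")
```

The generation function is untouched by the switch, which is the point of defining the resource as a datatype behind a variable.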

Features and advantages of TrindiKit

TrindiKit features

• explicit information state data structure
– makes systems more transparent
– enables e.g. context-sensitive interpretation, distributed decision making, asynchronous interaction

• update rules
– provide an intuitive way of formalising theories in a way which can be used by a system
– represent domain-independent dialogue management strategies

TrindiKit features cont’d

• resources
– represent domain-specific knowledge
– can be switched dynamically
  • e.g. switching language on-line in GoDiS

• modular architecture promotes reuse
– basic system -> genre-specific systems
– genre-specific system -> applications

Theoretical advantages of TrindiKit

• theory-independent
– allows implementation and comparison of competing theories
– promotes exploration of middle ground between simplistic and very complex theories of dialogue

• intuitive formalisation and implementation of dialogue theories
– the implementation is close to the theory

Practical advantages of TrindiKit

• promotes reuse and reconfigurability on multiple levels

• general solutions to general phenomena enable rapid prototyping of applications

• allows dealing with more complex dialogue phenomena not handled by current commercial systems

Availability

• TrindiKit website
– www.ling.gu.se/projects/trindi/trindikit

• SourceForge project
– development versions available
– developer community?

• licensed under GPL

• more info in
– Larsson & Traum: NLE Special Issue on Best Practice in Dialogue Systems Design, 2000
– TrindiKit manual (available from website)

A sample TrindiKit system: GoDiS

Research goals with GoDiS

• explore and implement issue-based dialogue management
– starting from Ginzburg’s theory of dialogue semantics based on notion of QUD (Questions Under Discussion)
– adapt to dialogue system (GoDiS) and implement
– extend theory coverage, taking in relevant theories

• general theory of dialogue
– minimize effort for adapting dialogue system to new domains

• incrementally extending system to handle increasingly complex types of dialogue
– clarifies relation between dialogue genres
– promotes reuse of update rules

• Larsson (2002): Issue-based Dialogue Management (PhD Thesis)

GoDiS: an issue-based dialogue system

• Built using TrindiKit
– Toolkit for implementing and experimenting with dialogue systems based on the information state approach

• Explores and implements issue-based dialogue management

1. Menu-based dialogue
– Action-oriented dialogue, VCR application
2. Multiple tasks, information sharing between tasks
3. Feedback and grounding
4. Accommodation, re-raising, clarification
5. Multilinguality & multiple domains

[Figure: systems built on TrindiKit, arranged in layers. IS approach: TrindiKit. IBDM: GoDiS. Genre-specific: GoDiS-I, GoDiS-A. Application-specific: Travel Agency, Auto-route, Xerox manual, VCR manager, home device manager.]

Issue-based dialogue management in GoDiS

• GoDiS: the basic system

• GoDiS-I: inquiry-oriented dialogue
– typically, database search
– implemented application: Travel Agency

• dialogue as raising and addressing issues

• dialogue plans to drive dialogue
– each plan associated with a ”task question”

• deals with multiple simultaneous issues

• enables information sharing between plans

[Figure: GoDiS architecture. Modules: input, interpret, update and select (together forming the DME), generate, output, and control. TIS: IS proper, 5 module interface variables, 3 resource interface variables (DATABASE, LEXICON, DOMAIN). Resources: database, lexicon, domain knowledge.]

1. Action-oriented dialogue based on menus

• GoDiS-A: adapted for the genre of action-oriented dialogue

• each plan now associated with an action or a question

• semi-automatic conversion of menus to dialogue plans

• sample domain: menu-based dialogue for VCR

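[Figure: GoDiS architecture for the VCR domain. Modules: input, interpret, update and select (together forming the DME), generate, output, and control. TIS: IS proper, 5 module interface variables, 3 resource interface variables (DEVICE, LEXICON, DOMAIN). Resources: VCR interface (connected to the VCR), lexicon, domain knowledge.]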

Simple Menu Traversal

S> Lets see. What can I do for you?
U> add program
S> Okay. Add a program... Lets see. What channel do you want?
U> channel one
S> Okay. channel one... What date?
U> today
S> Okay. Today... What time do you want to start recording?
U> six thirty
S> Okay. From six thirty... What time do you want to stop recording?
U> seven thirty.
S> Okay. Until seven thirty... The program has been added.

(dialog1_edit)

2. Multiple tasks and information sharing

• User can switch freely between any number of tasks
– e.g. set the clock while programming the VCR

• Information collected while doing task A can be used in task B
– For example, information about where the user wants to travel can be used for travel reservation, visa enquiries, hotel reservation etc.

• This is a consequence of keeping a global information state, rather than task-local states (as in e.g. VoiceXML)

• Reraising previous task or issue is explicitly indicated (more later)

Multiple tasks in VCR

S> Lets see. What can I do for you?
U> add program
S> Okay. Do you want to add a program?
U> yes
S> Lets see. What channel do you want?
U> channel five
S> Okay. Channel five… What date?
U> set the clock to five thirty   [second task introduced]
S> Okay. set the clock… Lets see. five thirty.
U> yes
S> The clock was set.
S> Returning to add a program. What date?   [explicit signalling of task sequencing (ICM)]
U> today
S> Okay. today.

(dialog2_edit)

Information sharing in TA application

S> Welcome to the travel agency!
U> price information
S> (…) Lets see. How do you want to travel?
U> by flight
S> (…) What city do you want to go to?
U> paris
S> (…) What city do you want to go from?
U> do I need a visa ?
S> (…) Lets see. What country are you from?
U> sweden
S> Okay. Yes, you need a Visa.
S> Returning to the issue of price. Lets see. What city do you want to go from?

Destination information is used both in getting price information and in getting visa information.
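The effect of a global information state on the dialogue above can be sketched in Python. The plan contents and slot names (`how`, `dest_city`, `dept_city`, `citizenship`) are assumptions modelled on the questions the system asks; GoDiS dialogue plans are richer than flat lists.

```python
# Hypothetical plans: each task lists the questions it needs answered.
PLANS = {
    "price": ["how", "dest_city", "dept_city"],
    "visa": ["dest_city", "citizenship"],
}


def next_question(task, commitments):
    """Return the first plan question not yet answered in the shared state."""
    for question in PLANS[task]:
        if question not in commitments:
            return question
    return None


# One global store of answers, shared by all tasks.
commitments = {"how": "flight", "dest_city": "paris"}

visa_next = next_question("visa", commitments)    # destination is not asked again
price_next = next_question("price", commitments)  # where the price task resumes
```

With task-local states the visa task would have to ask for the destination a second time; here it goes straight to the country question, as in the dialogue.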

3. Grounding and feedback

• Grounding
– making sure that the participants are perceiving, understanding, and accepting each other’s utterances
– dealing with problematic situations where e.g. an utterance is not perceived

• Feedback moves
– (short) utterances which signal grounding status of previous utterance

• Sequencing moves
– utterances which signal switching task (”returning to…”), reraising questions (”so…”) etc.

Grounding on multiple levels

S> Lets see. What can I do for you?
U> (mumble mumble)
S> Sorry, I didnt hear what you said. What can I do for you?   [negative perception feedback]
U> siridus
S> I heard you say siridus. Sorry, I dont understand. What can I do for you?   [positive perception; negative understanding]
U> channel one
S> channel one. I dont quite understand. Do you want to add a program or change channel?   [positive understanding (word meaning); negative understanding (relevance); task clarification]

(dialog3a_edit)

U> turn on the t v
S> turn on the TV… Sorry, I cannot control the TV set. Do you want to add a program or change channel?   [positive understanding; rejection (negative acceptance feedback)]
U> pardon   [user negative perception]
S> Sorry, I cannot control the TV set. Do you want to add a program or change channel?   [system repetition]
U> change channel
S> Okay.   [positive acceptance]

(dialog3b_edit)

Correcting the system

S> Lets see. What can I do for you?
U> add program
S> Okay. Add a program…
S> Lets see. What channel do you want?
U> channel six
S> channel five?   [interrogative feedback]
U> no   [negative answer to feedback]
S> not channel five. So, What channel do you want?   [”So” signals reraising (sequencing ICM)]
U> channel six
S> Okay. channel six.

(dialog4_edit)

4. Question Accommodation

• To deal with
– User giving more/less/other information than requested
– Guessing what the user wants to do

• Basic idea:
– Move questions to QUD or ISSUES to adapt to user utterances
– E.g. short answers where question can be found in the context
– Automatically generates clarification question if necessary
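One way to picture the basic idea is the Python sketch below: given the kinds of information in a user utterance, look for plans whose questions they could answer, and ask for clarification when more than one plan fits. This is a simplified illustration, not the GoDiS accommodation rules; the plan contents and the `accommodate_task` function are assumptions of this example.

```python
# Hypothetical plans for the VCR domain: task -> questions it raises.
PLANS = {
    "add_program": ["channel", "date", "start_time", "stop_time"],
    "change_channel": ["channel"],
}


def accommodate_task(answer_slots, plans):
    """Find the task(s) whose plan contains a question for every answer."""
    candidates = [
        task for task, questions in plans.items()
        if all(slot in questions for slot in answer_slots)
    ]
    if len(candidates) == 1:
        return ("accommodate", candidates[0])  # load that plan silently
    if candidates:
        return ("clarify", candidates)         # ask which task was meant
    return ("reject", [])


# "channel five today" fits only one plan.
unambiguous = accommodate_task(["channel", "date"], PLANS)

# "channel five" fits two plans, so a clarification question is needed.
ambiguous = accommodate_task(["channel"], PLANS)
```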

Task and question accommodation

S> Lets see. What can I do for you?
U> channel five today
S> Alright. add a program… Lets see. Okay. channel five… today… What time do you want to start recording?   [”Alright” signals accommodation (sequencing ICM)]
U> from six thirty to seven thirty
S> Okay. from six thirty… to seven thirty… The program has been added.

(dialog6_edit)

Task clarification

S> What can I do for you?
U> channel five
S> channel five. I dont quite understand. Do you want to add a program or change channel?
U> change channel
S> Okay. change channel.
S> Lets see. channel five?
U> yes
S> The channel was changed

(dialog7_edit)

Revision by reaccommodation and reraising

S> What can I do for you?
U> add program channel five
S> Okay. Do you want to add a program?
U> yes
S> Lets see. channel five?
U> yes
S> What date?
U> channel six
S> Okay. channel six… What date?

5. Multilinguality and multiple domains

• Language and domain resources can be switched on the fly

• Does not affect the rest of the system

• Thanks to modularity!

Multilinguality

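S> Lets see. What can I do for you?
U> add program
S> Okay. Do you want to add a program?
U> yes
S> Lets see. What channel do you want?
U> svenska   [”Swedish”]
S> Okej.   [”Okay.”]
S> Gå tillbaks till lägg till ett program. Vilken kanal vill du spela in?   [”Returning to add a program. Which channel do you want to record?”]
U> kanal ett   [”channel one”]
S> kanal ett, är det korrekt?   [”channel one, is that correct?”]
S> Så, Vilken kanal vill du spela in?   [”So, which channel do you want to record?”]
U> english
S> Okay...

(dialog5_edit)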

Multiple domains

S> Welcome to the VCR manager!
S> Lets see. What can I do for you?
U> change domain
S> Okay. change domain… Lets see. What domain do you want?
U> telephone
S> Okay. This is your automatic telephone operator… Lets see. Please specify a function.
U> call
S> Okay. make a phone call.
S> Lets see. Please specify the destination of the call.
U> luis
S> Okay. Luis… Calling

Summary: TrindiKit

• a toolkit for dialogue systems R&D
• freely available to researchers
• closes the gap between theory and practice of dialogue systems
• theory-independent
• promotes reuse and reconfigurability on multiple levels
• enables implementation of flexible dialogue management
– e.g. issue-based dialogue management in GoDiS

TrindiKit and VoiceXML

• VoiceXML
– industry standard
– form-based dialogue manager
– web/telephony infrastructure
– requires scripting dialogues in detail

• Theory-specific?
– VoiceXML implements a specific theory of dialogue
– TrindiKit allows implementation of several different theories of dialogue
– More complex dialogue phenomena hard to deal with in VoiceXML

TrindiKit and VoiceXML, cont’d

• Combine VoiceXML with TrindiKit?
– future research area
– support connecting TrindiKit to VoiceXML infrastructure
– use TrindiKit system as VoiceXML server, dynamically building VoiceXML scripts
– convert VoiceXML specifications to e.g. GoDiS dialogue plans