Multimodal Communication in the Staging Virtual Farm
Patrizia Paggio and Bart Jongejan
Center for Sprogteknologi
MUMIN workshop, Helsinki 2002
Paggio and Jongejan - Helsinki ‘02 2
The Staging project (www.staging.dk)
Interdisciplinary Danish project: nature and use of 3D applications populated with autonomous agents.
CST’s work: multimodal communication components of a 3D virtual farm.
Focus: multimodal integration, mixed-initiative dialogue, interaction between dialogue and other behaviours.
The Staging VE
The VE is in charge of simulating the world: it provides the agents with sensory information and processes requests from the agents (move objects, produce sounds, play animations).
The Staging VE was developed at CVMT (Aalborg University); CST has developed a mock-up for testing purposes.
Agents
Agents carry out behaviours in reaction to external stimuli and according to their inner state (hunger, tiredness…), based on the strength of their activation level.
Engaging in a dialogue with the user’s avatar is also a behaviour.
Dialogue behaviour has a strong degree of activation for the farmer agent.
The Aalborg VE
The CST farm
(An image of our VE should be shown here.)
Multimodal communication
Users can interact with agents via various devices: microphone, keyboard, touch screen, data glove.
Commercial speech technology; dedicated gesture recogniser (Karin Husballe Munk at CVMT).
Speech can be combined with deictic, iconic and turn-taking gestures (Cassell and Prevost 1996). Gestures and speech are merged by a multimodal parser.
Multimodal integration
(Architecture diagram: speech and hand movements are the inputs; speech goes through speech recognition, hand movements through gesture recognition (pointing, size, turn-taking); both feed chart initialisation, followed by parsing, semantic mapping, and communication management, which issues an action.)
More integration
Gesture and word are paired:
Feed that cow$1|cow
The gesture adds information to the lexicon entry.
Word and gesture must be (nearly) synchronous.
Syntactic constraints: deictic (pointing) requires a noun or pronoun; iconic (size) requires a noun.
Semantic constraints: semantic types must be compatible.
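As a rough sketch of these pairing constraints (hypothetical Python, not the project's actual implementation; all names and the synchrony threshold are illustrative):

```python
# Hypothetical sketch of the gesture-word pairing constraints described above;
# covers deictic and iconic gestures only.

SYNTACTIC_CONSTRAINTS = {
    "deictic": {"noun", "pronoun"},   # pointing requires a noun or pronoun
    "iconic": {"noun"},               # size gestures require a noun
}

def can_pair(gesture, word, max_offset=0.5):
    """Return True if the gesture and word may be merged into one unit."""
    # Word and gesture must be (nearly) synchronous.
    if abs(gesture["time"] - word["time"]) > max_offset:
        return False
    # Syntactic constraint: the gesture kind restricts the word's category.
    allowed = SYNTACTIC_CONSTRAINTS.get(gesture["kind"])
    if allowed is not None and word["pos"] not in allowed:
        return False
    # Semantic constraint: semantic types must be compatible.
    return gesture["semtype"] == word["semtype"]

pointing = {"kind": "deictic", "time": 1.2, "semtype": "animal"}
cow = {"pos": "noun", "time": 1.0, "semtype": "animal", "lemma": "cow"}
print(can_pair(pointing, cow))  # True: synchronous, noun, compatible types
```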
Example
Feed that cow$1|animal.point
gesture := <object-type>$<internal-id>
{act=request, predicate=feed, arg3={reln=animal, semtype=animal, objectid=cow$1}}
reln and object type unified, semtype compatible, objectid added.
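The merge in the example can be read as unification of the gesture's feature structure with the noun's semantics. A minimal flat illustration (hypothetical code; a real chart parser would unify recursively):

```python
# Minimal unification sketch for the example above: attribute-value
# structures are merged, and unification fails on conflicting values.

def unify(a, b):
    """Merge two attribute-value structures; return None on conflict."""
    result = dict(a)
    for key, value in b.items():
        if key in result and result[key] != value:
            return None  # incompatible values: unification fails
        result[key] = value
    return result

noun = {"reln": "animal", "semtype": "animal"}        # from the word "cow"
gesture = {"semtype": "animal", "objectid": "cow$1"}  # from the point gesture

merged = unify(noun, gesture)
print(merged)
# {'reln': 'animal', 'semtype': 'animal', 'objectid': 'cow$1'}
```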
Contradiction example
Feed that cow$1|apple.
{act=request, predicate=feed, arg3={reln=animal, semtype=animal, objectid=cow$1}}
The gesture and noun semantic types are incompatible; only the interpretation provided by the gesture is compatible with the semantics of the predicate and survives.
Examples
Deictic gestures:
U: Feed an animal, please. A: Which animal shall I feed? U: Take that cow (+ pointing)
Iconic gestures:
U: Feed the sheep, please. A: Which food shall I take? U: The small apple (+ size)
Turn-giving and taking gestures:
U: Hi (+ give turn) A: Shall we...
The Communication Manager
Interprets the user's dialogue moves.
Builds dialogue trees.
Interprets references not resolved by gestures.
Decides the agent's dialogue moves based on the preceding dialogue and on changes in the VE.
Dialogue goals arising from the scenario are combined with dialogue obligations created by the preceding dialogue.
Dialogue goals
Dialogue goals are created based on domain-specific action templates (Badler et al. 1999).
A template specifies actions with related semantic arguments, corresponding attribute name in the semantic representation, necessary preconditions.
FeedAction(Topic=Feed, Animal=<arg3>,
Food=<arg2>, Tool=<instr>,
Precondition=Hungry(Animal))
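An action template of this kind could be modelled roughly as follows. The field names mirror the FeedAction example above; the code itself is a hypothetical illustration, not the Staging implementation:

```python
# Illustrative model of a domain-specific action template with a
# precondition checked against the state of the virtual environment.

from dataclasses import dataclass

@dataclass
class ActionTemplate:
    topic: str
    # Maps semantic roles to attribute names in the semantic representation.
    roles: dict
    # Precondition that must hold in the VE before the action is carried out.
    precondition: callable

def is_hungry(world, animal):
    return world.get(animal, {}).get("hungry", False)

feed_action = ActionTemplate(
    topic="Feed",
    roles={"Animal": "arg3", "Food": "arg2", "Tool": "instr"},
    precondition=lambda world, args: is_hungry(world, args["Animal"]),
)

world = {"cow$2": {"hungry": False}}
args = {"Animal": "cow$2", "Food": "apple", "Tool": "pitchfork"}
if not feed_action.precondition(world, args):
    print("The cow is not hungry.")  # cf. the "precondition not met" example
```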
Example: feed action
U: Hi, come here. A: Okay, I'll do it.
U: Feed an animal. A: Which animal shall I take?
U: That cow$1|cow. A: Which food shall I take?
U: (Take) a small$|small apple. A: Which tool shall I take?
U: Take the pitchfork. A: Okay, I'll do it.
Example: precondition not met
U: Give that brown cow$2|cow an apple, please. ... A: The cow is not hungry.
Example: agent initiative
A: Shall I feed the brown cows and the sheep?
U: Yes, give the animals a carrot.
A: Which tool shall I take?
U: The pitchfork.
A: Okay, I'll do it.
Dialogue obligations
A set of condition/obligation pairs models valid speech act sequences.
E.g.: Request/Accept or Reject; Whque/Answer or Inform.
Used to:
produce a correct reaction to a user move;
interpret a user move as either closing a dialogue segment or opening a new one.
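The condition/obligation pairs could be represented as a simple table keyed on the incoming speech act. A hypothetical sketch (the pairs mirror the slide; the lookup code is illustrative):

```python
# Sketch of condition/obligation pairs for valid speech-act sequences.

OBLIGATIONS = {
    "request": {"accept", "reject"},  # a request obliges accept or reject
    "whque": {"answer", "inform"},    # a wh-question obliges answer or inform
}

def is_valid_reaction(user_move, agent_move):
    """Check whether the agent's move discharges the obligation created by
    the user's move, i.e. closes the current dialogue segment rather than
    opening a new one."""
    return agent_move in OBLIGATIONS.get(user_move, set())

print(is_valid_reaction("request", "accept"))  # True
print(is_valid_reaction("whque", "request"))   # False: opens a new segment
```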
Dialogue trees
request  U: Give the white cow an apple please.
whque    A: Which tool shall I use?
whque    U: Where is the pitchfork?
inform   A: The pitchfork is in front of the tree.
request  U: Take the pitchfork then.
accept   A: Okay, I'll do it.
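Read as a tree, the wh-question about the tool opens a sub-segment, which itself nests the question about the pitchfork's location. A toy representation (hypothetical; the nesting shown is one plausible reading of the exchange above):

```python
# Toy nested representation of a dialogue tree; each node is a tuple
# (speech_act, speaker, utterance, children). Structure only, no parsing.

dialogue = ("request", "U", "Give the white cow an apple please.", [
    ("whque", "A", "Which tool shall I use?", [
        ("whque", "U", "Where is the pitchfork?", [
            ("inform", "A", "The pitchfork is in front of the tree.", []),
        ]),
        ("request", "U", "Take the pitchfork then.", [
            ("accept", "A", "Okay, I'll do it.", []),
        ]),
    ]),
])

def depth(node):
    """Depth of nesting: how far dialogue segments are embedded."""
    act, speaker, text, children = node
    return 1 + max((depth(c) for c in children), default=0)

print(depth(dialogue))  # 4
```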
Relaxing the rules
Condition/obligation pairs do not always fit. Speech acts can be implied:
A: Hi. U: Feed the animals, please.
They can be coerced:
U: Feed an animal. A: Which animal shall I take? U: Feed the brown cow then.
Conclusions
Staging has made an initial attempt at giving an agent multimodal dialogue abilities to allow for mixed-initiative dialogues.
Future research:
more advanced gesture recognition
better understanding of how gestures and speech can complement each other
repairs and self-repairs
interaction between dialogue and other behaviours
FILM