
HIGGINS: A spoken dialogue system for investigating error handling techniques

Jens Edlund, Gabriel Skantze and Rolf Carlson

Scenario

User: I want to go to the closest subway station.

System: Ok, to the closest subway station. Can you describe where you are now?

User: I have an ATM to my left and a pedestrian crossing in front of me.

System: Can you see some trees to your right and a white building in front of you?

User: Yes

System: Ok, take left after the large building which you have on your left and follow the street until you reach a crossing.

User: Ok, there is a bus station here.

System: That’s right. Take left again after the bus station.

Centre for Speech Technology


[Diagram labels: User utterance; User reaction/repair; Assume understanding; No recovery; Non-understanding; Assume understanding]

Architecture

The Higgins Project

• Instigated in 2003
• Theoretical goal: Investigate error handling techniques for collaborative dialogue systems
• Practical goal: Build a system in which these can be tested empirically
• This poster presents the current stage of the project.

Error recovery (Non-understanding)

Error recovery

• Map-task-like studies on human-human conversation using ASR in one direction:
• Results show that humans tend not to signal non-understanding:
• This leads to
  • Increased experience of task success
  • Faster recovery from non-understanding
• Skantze, G. (2003). Exploring human error handling strategies: implications for spoken dialogue systems.

[Diagram labels: Early error detection; Grounding; Late error detection; Error recovery (Misunderstanding)]

Late error detection

The need for late error detection is task dependent:
• Sometimes not necessary:
• Sometimes reference handling is sufficient:
• For slots with multiple possible values, late error detection is necessary:

(These also exemplify misunderstanding error recovery.)

Grounding

• The amount of feedback from the system should at least depend on
  • Confidence of understanding
  • Consequence of misunderstanding
• The discourse modeller
  • Unifies assertions and tracks referents
  • Resolves ellipses
  • Resolves anaphora
  • Keeps track of who contributed which information:
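The first point above, that feedback should depend on confidence and consequence, can be illustrated with a minimal sketch. The three feedback levels, the thresholds, the risk formula and the function name `choose_feedback` are all assumptions made for this example; they are not taken from the Higgins implementation.

```python
from enum import Enum


class Feedback(Enum):
    # Hypothetical feedback levels, ordered from least to most feedback.
    ACCEPT = "accept silently"
    IMPLICIT = "implicit verification"  # e.g. "Ok, to the closest subway station."
    EXPLICIT = "explicit verification"  # e.g. an outright clarification question


def choose_feedback(confidence: float, consequence: float) -> Feedback:
    """Pick a feedback level from confidence of understanding and
    consequence of misunderstanding, both assumed to lie in [0, 1].

    Assumed rule: risk = (1 - confidence) * consequence, and a higher
    risk calls for more feedback. The thresholds are illustrative only.
    """
    if not (0.0 <= confidence <= 1.0 and 0.0 <= consequence <= 1.0):
        raise ValueError("confidence and consequence must be in [0, 1]")
    risk = (1.0 - confidence) * consequence
    if risk < 0.1:
        return Feedback.ACCEPT
    if risk < 0.4:
        return Feedback.IMPLICIT
    return Feedback.EXPLICIT
```

Under these assumptions, a confidently recognised concept with moderate consequence is accepted without comment, while a poorly recognised concept with high consequence triggers an explicit check.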

Early error detection

[Diagram labels: KTH LVCSR (large-vocabulary probabilistic ASR); Machine-learned error detection; Rule-driven semantic/syntactic error detection; Rule-driven discourse error detection]

• Which features could be used for detecting word-level errors?
• How are they operationalised?
• Initial tests with memory-based and transformation-based learning suggest:
  • Utterance context
  • Lexical information
  • Word confidences
  • Discourse history

• Skantze, G. & Edlund, J. (2004). Early error detection on word level.
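To make the four feature groups concrete, the following is a minimal sketch of word-level error detection in the memory-based style (a k-nearest-neighbour vote over stored instances). The feature encoding, the confidence threshold, the overlap distance and the tiny `MEMORY` of labelled instances are invented for illustration; they are not the features, data or learner configuration used in the cited study.

```python
from collections import Counter
from typing import List, Sequence, Set, Tuple

# (word, previous word, confidence bin, seen earlier in the discourse)
Features = Tuple[str, str, str, str]


def extract_features(words: Sequence[str], confidences: Sequence[float],
                     discourse_words: Set[str], i: int) -> Features:
    """Build one symbolic feature vector for the word at position i."""
    word = words[i]                                          # lexical information
    previous = words[i - 1] if i > 0 else "<s>"              # utterance context
    conf_bin = "high" if confidences[i] >= 0.5 else "low"    # word confidence
    in_history = "yes" if word in discourse_words else "no"  # discourse history
    return (word, previous, conf_bin, in_history)


def overlap_distance(a: Features, b: Features) -> int:
    """Number of feature positions on which two instances disagree."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory: List[Tuple[Features, str]], query: Features, k: int = 3) -> str:
    """Label the query by majority vote among its k nearest stored instances."""
    nearest = sorted(memory, key=lambda item: overlap_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented toy instances, labelled "correct" or "error".
MEMORY: List[Tuple[Features, str]] = [
    (("building", "large", "high", "yes"), "correct"),
    (("left", "my", "high", "no"), "correct"),
    (("address", "crossing", "low", "no"), "error"),
    (("crossing", "yes", "low", "no"), "error"),
    (("tree", "a", "high", "yes"), "correct"),
    (("now", "address", "low", "no"), "error"),
]

words = ["i", "see", "a", "crossing"]
confidences = [0.9, 0.8, 0.9, 0.3]
query = extract_features(words, confidences, {"building", "tree"}, 3)
label = classify(MEMORY, query)  # "error" with this toy memory
```

A low-confidence word that has not been mentioned before ends up closest to the stored error instances, so it is flagged; a real system would need far more training data and a tuned feature set.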

ASR post-processing

PICKERING: Robust interpretation

• Rule-based semantic parsing
• Finds partial results with largest coverage
• Allows insertions inside phrases
• Allows non-agreement if necessary
• Evaluation results show robustness against inserted content words
• Skantze, G. & Edlund, J. (2004). Robust interpretation in the Higgins spoken dialogue system.
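Two of the properties listed above, allowing insertions inside phrases and preferring partial results with the largest coverage, can be illustrated with a small sketch. The rule table `RULES`, the concept names, the insertion limit and the greedy selection are assumptions made for this example; this is not PICKERING's grammar format or its actual algorithm.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy rules: concept name -> word pattern.
RULES: Dict[str, List[str]] = {
    "object:building": ["a", "large", "building"],
    "position:left": ["on", "my", "left"],
    "position:right": ["on", "my", "right"],
}


def match_with_insertions(pattern: List[str], tokens: List[str], start: int,
                          max_insertions: int = 1) -> Optional[List[int]]:
    """Match pattern beginning at tokens[start], skipping at most
    max_insertions unexpected words in total. Returns the indices of the
    matched words, or None if the pattern does not match."""
    if not pattern or start >= len(tokens) or tokens[start] != pattern[0]:
        return None
    indices = [start]
    skipped = 0
    pos = start + 1
    for word in pattern[1:]:
        while pos < len(tokens) and tokens[pos] != word:
            skipped += 1  # treat the unexpected word as an insertion
            pos += 1
            if skipped > max_insertions:
                return None
        if pos >= len(tokens):
            return None
        indices.append(pos)
        pos += 1
    return indices


def parse(tokens: List[str], max_insertions: int = 1) -> List[Tuple[str, List[int]]]:
    """Return non-overlapping partial results, preferring larger coverage."""
    candidates = []
    for concept, pattern in RULES.items():
        for start in range(len(tokens)):
            indices = match_with_insertions(pattern, tokens, start, max_insertions)
            if indices is not None:
                candidates.append((concept, indices))
    # Greedy approximation: take the matches covering the most words first.
    candidates.sort(key=lambda c: len(c[1]), reverse=True)
    chosen, used = [], set()
    for concept, indices in candidates:
        span = set(range(indices[0], indices[-1] + 1))
        if not span & used:
            chosen.append((concept, indices))
            used |= span
    return sorted(chosen, key=lambda c: c[1][0])


result = parse("i have a large red building on my left".split())
# [('object:building', [2, 3, 5]), ('position:left', [6, 7, 8])]
```

The unexpected word "red" is skipped inside the building phrase, and the unmatched words "i have" do not prevent the two partial results from being returned. With `max_insertions=0` only the position phrase survives.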

[Diagram labels: ASR; Utterance interpretation; Discourse modelling; Generation; Decision making; TTS]

• The architecture is a distributed modular system
• Goals:
  • A module for every task that is reasonably well-defined
  • Separation of the domain specific (XML) and the domain independent (module code)
• Incremental processing allows for:
  • Rapid feedback
  • Flexible turn-taking
  • Faster processing
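The idea of incremental processing across modules can be sketched with Python generators: each stage consumes its input item by item and passes results on before the utterance is complete. The stage functions, the toy lexicon and the feedback strings below are hypothetical stand-ins, not the Higgins modules or their message formats.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical toy lexicon: word -> (slot, value).
LEXICON = {
    "building": ("object", "building"),
    "tree": ("object", "tree"),
    "left": ("direction", "left"),
    "right": ("direction", "right"),
}


def asr(words: Iterable[str]) -> Iterator[str]:
    """Stand-in for a recogniser: emits word hypotheses one at a time."""
    for word in words:
        yield word


def interpret(word_stream: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Emit an updated partial interpretation whenever a word adds information."""
    state: Dict[str, str] = {}
    for word in word_stream:
        if word in LEXICON:
            slot, value = LEXICON[word]
            state[slot] = value
            yield dict(state)  # copy, so later updates do not alter earlier output


def decide(interpretations: Iterable[Dict[str, str]]) -> Iterator[str]:
    """React to every partial interpretation instead of waiting for the end."""
    for sem in interpretations:
        if "object" in sem and "direction" in sem:
            yield f"Ok, a {sem['object']} on your {sem['direction']}."
        else:
            yield "Mm"  # illustrative rapid feedback on a partial result


# Generators are lazy, so each word flows through all stages
# before the next word is requested from the recogniser.
responses = list(decide(interpret(asr("i have a building on my left".split()))))
# ['Mm', 'Ok, a building on your left.']
```

Because every stage works on partial input, the system can respond as soon as "building" has been interpreted rather than at the end of the utterance.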

U1: I want to go to Boston
S1: To London...
U2: No, to Boston!

U1: How much is the big apartment?
S1: The small apartment is […]
U2a: No, the big apartment!
U2b: And the big apartment?

U1: I have a large building on my left
S1: A large building on your right
U2a: No, on my left!
U2b: And on my left

Misunderstanding

U1: There is a large red building
S2: What material is the large building made of?

O1: Do you see a wooden house in front of you?
U1: YES CROSSING ADDRESS NOW (I pass the wooden house now)
O2: Can you see a restaurant sign?

[Diagram labels: Vocoder; User; Operator; Listens; Speaks; Reads; Speaks; ASR]

GALATEA: Discourse modelling
