Universität Ulm, Informationstechnik - Dialogsysteme
Dirk Bühler, Universität Ulm, 13.10.2004

Towards Embedding VoiceXML Applications Through Compilation

dirk.buehler@e-technik.uni-ulm.de
stefan.hamerich@temic-sds.com (Harman/Becker AG)



Outline

• Introduction

• Compilation procedure

• Preliminary results

• Conclusions, Future work


Outline

• Introduction
  - Motivation
  - Problems with VXML interpretation

• Compilation procedure

• Preliminary results

• Conclusions, Future work


Application Scenario

Automotive Environment

• Closed environment
  - no need to access dynamic data via the web

• Tasks: control onboard devices such as
  - Navigation
  - Radio, CD, MP3 entertainment
  - Telephone (dialing)

Regarding VoiceXML, a static subset based on forms is sufficient for modeling the needed functionality.


Typical Integration / Automotive

• Integration into cars currently done as
  - Hardware integration
  - Software integration

• Compilation: e.g. GDML (TEMIC dialog description language) → C code → native executable/hardware

• MTBF (mean time between failures) is an issue: typically > 10 years required


Problems with Traditional VXIs

• HTTP/file serving for documents

• VXML/ECMA parsing (SRGS?)

• ECMA interpretation

Resource consumption and control are problematic.

[Figure: VoiceXML Interpreter, Document Server, Implementation Platform; arrow label: request document]


Why the VoiceXML Standard is Relevant

• Standard for dialogue development

• Need developers, e.g. external application providers

• “Buzzword” for clients


EcmaScript and its Role in VoiceXML

ECMA-262: Standardized version of JavaScript (1999)

• Dynamically typed imperative language

• Expressive: records + first-class functions = objects

• Compact Profile: ECMA-327

In VoiceXML, it may be used

• on the client side

• for event handling, cf. HTML

• for simple calculations (e.g. checking credit card numbers)

• for accessing platform objects

The VoiceXML specification is based on EcmaScript.
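As an illustration of the kind of "simple calculation" meant here, a credit card check can be written as a plain EcmaScript function and called from a VoiceXML cond or expr attribute. The sketch below uses the well-known Luhn checksum; the function name isValidCardNumber is hypothetical and not taken from the talk.

```javascript
// Luhn checksum: a sketch of a "simple calculation" a VoiceXML script might do.
// isValidCardNumber is a hypothetical name, not part of VoiceXML or the talk.
function isValidCardNumber(digits) {
  var sum = 0;
  var doubleIt = false; // every second digit, counted from the right, is doubled
  if (typeof digits !== "string" || !/^[0-9]+$/.test(digits)) {
    return false;
  }
  for (var i = digits.length - 1; i >= 0; i--) {
    var d = digits.charCodeAt(i) - 48; // '0' has character code 48
    if (doubleIt) {
      d *= 2;
      if (d > 9) { d -= 9; } // same as adding the two digits of the product
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}
```

In a form, such a function could guard a field, for example through a cond attribute that only lets the dialog proceed once the recognized number passes the check.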


„Lines like these“


Outline

• Introduction

• Compilation procedure
  - Approach
  - VoiceXML example
  - Form interpretation algorithm

• Preliminary results

• Conclusions, Future work


Compilation Approach

• Off-line document retrieval: VoiceXML dialogs, grammars, scripts

• No dynamically created documents

• Compilation to EcmaScript
  - Copy script code verbatim
  - Include VoiceXML <script> elements and external scripts

• Adherence to the Compact Profile (as far as possible)

• Platform integration through simple APIs for ASR and TTS
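What a "simple API for ASR and TTS" could look like is sketched below. This is not the interface used in the talk: makeStubPlatform, phraseGrammar, recognize and the result shape are assumptions made for illustration, and the stub replaces speech with scripted text input so the code runs anywhere.

```javascript
// Hypothetical platform layer: the compiled dialog only needs two entry
// points. A real port would bind them to the target's TTS and ASR engines.
function makeStubPlatform(scriptedInputs) {
  var spoken = [];   // everything "played" via TTS, kept for inspection
  var next = 0;      // index of the next scripted user input
  return {
    spoken: spoken,
    // TTS: queue a prompt for output.
    prompt: function (text) {
      spoken.push(text);
    },
    // ASR: return the next utterance if an active grammar accepts it,
    // "noinput" when the user stays silent, "nomatch" otherwise.
    recognize: function (grammars) {
      var utterance = scriptedInputs[next++];
      if (utterance === undefined || utterance === null) {
        return { event: "noinput" };
      }
      for (var i = 0; i < grammars.length; i++) {
        if (grammars[i].accepts(utterance)) {
          return { utterance: utterance };
        }
      }
      return { event: "nomatch" };
    }
  };
}

// A grammar reduced to a fixed phrase list (assumption; real SRGS is richer).
function phraseGrammar(phrases) {
  return {
    accepts: function (utterance) {
      for (var i = 0; i < phrases.length; i++) {
        if (phrases[i] === utterance) { return true; }
      }
      return false;
    }
  };
}
```

Keeping the surface this small means the generated code never has to know whether input comes from a recognizer, a keyboard or a test script.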


Advantages

At run-time:

• No HTTP serving

• No XML parsing (VoiceXML and grammars)

• No EcmaScript parsing

Resources, control, optimization, integration, intellectual property(?)

[Figure: VoiceXML Application, Document Server, Implementation Platform; arrow label: request document]


Translating a VoiceXML <form>

• Primary <form> components:
  - items: <field>, <block>, ...
  - <filled> handlers
  - <grammar> elements

• VoiceXML specifies a “dialog” data structure, corresponding to the current form.

• Idea: Generate this data structure to be processed by a separate Form Interpretation Algorithm (FIA).


Translating a <field> item

• Spec: Item value is represented as an EcmaScript variable in the dialog data structure

• Shadow variables, for storing visit count, events, etc.

• Important components:
  - @cond, specifying arbitrary guard conditions
  - <grammar>, for ASR (incl. semantics)
  - <prompt>, for TTS and audio
  - event handlers

• Executable parts may be translated as methods (function objects)
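The points above can be sketched for a single field. The field name size, the wantsPizza flag, the prompt texts and the helper isSelectable are made up for illustration; only the pairing of an item variable with a "$"-suffixed shadow object follows the naming pattern used in the talk's own example.

```javascript
// Sketch of a compiled <field name="size">; the field and all names except
// the `dialog.x` / `dialog.x$` pattern are hypothetical.
var played = [];                       // stands in for the TTS queue
function prompt(text) { played.push(text); }

var dialog = { wantsPizza: true };

dialog.size = undefined;               // item value: unfilled while undefined
dialog.size$ = {                       // shadow variable: bookkeeping + methods
  count: 0,                            // prompt counter (visits so far)
  cond: function () {                  // @cond guard
    return dialog.wantsPizza === true;
  },
  grammars: [],                        // grammars would be attached here
  prompt: function (count$) {          // <prompt> elements, selected by count
    if (count$ >= 2) {
      prompt("Please say small, medium or large.");
    } else {
      prompt("What size?");
    }
  }
};

// An item is eligible when it is still unfilled and its guard holds.
function isSelectable(name) {
  return dialog[name] === undefined && dialog[name + "$"].cond();
}
```

Because guards and prompts become ordinary function objects, the interpreter that walks this structure needs no knowledge of the original XML.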


Example: VoiceXML → EcmaScript

// EcmaScript
document$.pizza_order = {
  grammars: [ loadGrammarFromURI("pizza.grxml") ],
  init: function (params$) {
    [...]
    dialog.initial1 = (undefined);
    dialog.initial1$ = {
      cond: function () {
        [...]
        return (document.doInit());
      },
      grammars: [],
      prompt: function (count$) {
        [...]
        if (true) {
          prompt("May I take your order?");
        }
        if (count$ == 2) {
          prompt("You can say something like: [...]");
        }
      },
      handle: function (_event, count$) {
        [...]
        if (count$ == 1 && (_event == "noinput")) {
          $doprompt = true;
          throw 'continue';
        } else if (count$ == 2 && (_event == "noinput")) {
          initial1 = ('nothing');
          $doprompt = true;
          throw 'continue';
        } else throw _event;
      }
    };
    [...]

<vxml version="2.0">
<form id="pizza_order">
  <grammar src="pizza.grxml"/>

  <initial name="initial1" cond="document.doInit()">
    <prompt> May I take your order? </prompt>
    <prompt count="2"> You can say something like: [...] </prompt>

    <noinput count="1">
      <reprompt/>
    </noinput>
    <noinput count="2">
      <assign name="initial1" expr="'nothing'"/>
      <reprompt/>
    </noinput>
  </initial>
  [...]

Execution trace:

  Entering form pizza_order
  Activate grammar pizza.grxml
  Evaluate initial1
  Evaluate document.doInit()
  Update doInit$.count
  Initiate prompts

  System: May I take your order?
  User: <no input>
    Handle exception "noinput": reprompt

  System: You can say something like: [...]
  User: <no input>
    Handle exception "noinput": assign initial1

  System: Would you like ...


Form Interpretation Algorithm

• Coded completely in EcmaScript

While processing a form:

• Select field to execute, or quit

• Collect item:
  - Play (execute) item's prompts
  - Pass grammars to recognizer
  - Determine type of result
  - and handle exceptions

• Process result and execute <filled> handlers

• Goto 10

dialog data structure:

  item$, item$.cond()
  item$.count, item$.prompt()
  item$.grammars, application.lastresult$
  item$.handle()
  dialog.filled()
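The loop on this slide can be sketched in a few lines of EcmaScript. This is a simplified illustration, not the FIA from the talk: runForm, the explicit item order and the recognize callback with its result shape are assumptions, while cond, count, prompt, grammars, handle, filled and the 'continue' signal follow the names shown in the deck.

```javascript
// Minimal FIA sketch (assumptions: item order is given explicitly, and
// `recognize` is a hypothetical stand-in for the ASR API).
function runForm(dialog, itemNames, recognize) {
  for (;;) {
    // 1. Select the first unfilled item whose guard holds, or quit.
    var name = null;
    for (var i = 0; i < itemNames.length; i++) {
      var candidate = itemNames[i];
      if (dialog[candidate] === undefined && dialog[candidate + "$"].cond()) {
        name = candidate;
        break;
      }
    }
    if (name === null) { return; }

    // 2. Collect: play prompts, then pass the grammars to the recognizer.
    var item$ = dialog[name + "$"];
    item$.count += 1;
    item$.prompt(item$.count);
    var result = recognize(item$.grammars);

    // 3. Process: events go to the item's handler, values fill the item.
    if (result.event) {
      try {
        item$.handle(result.event, item$.count);
      } catch (signal) {
        if (signal !== "continue") { throw signal; } // unhandled event
      }
    } else {
      dialog[name] = result.value;
      dialog.filled(name);
    }
  } // 4. "Goto 10"
}

// Usage: one field, a silent user on the first turn, an answer on the second.
var log = [];
var inputs = [{ event: "noinput" }, { value: "large" }];
var dialog = {
  size: undefined,
  size$: {
    count: 0,
    grammars: [],
    cond: function () { return true; },
    prompt: function (count$) {
      log.push(count$ === 1 ? "What size?" : "Say small or large.");
    },
    handle: function (event, count$) {
      if (event === "noinput") { throw "continue"; }
      throw event;
    }
  },
  filled: function (name) { log.push("filled " + name); }
};
runForm(dialog, ["size"], function () { return inputs.shift(); });
```

The sketch omits most of what the VoiceXML specification requires of a FIA, for example form-level grammars, mixed initiative and event counters per event type.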


Outline

• Introduction

• Compilation procedure

• Preliminary results
  - VXML to Java
  - VXML on PocketPC
  - Limitations

• Conclusions, Future Work


Compiler

• Written in Haskell (GHC)

• OS independent: Linux, Windows, Solaris, ...

• VoiceXML parsing and recursive document retrieval (about 300 lines of code)

• Compilation into text (about 300 lines of code)

Component      Format   Size [KB]
FIA            .JAR     32
Front-end      .JAR     57
Rhino          .JAR     600
Demo (Pizza)   .VXML    6
               .JS      15
               .JAR     42


Experiment 1: VoiceXML to Java

• Compilation to .class files using the Rhino JavaScript engine (Mozilla project)

• OS independent: Windows, Linux, ...

• Requires: JRE / Java Web Start

• Integration with FreeTTS

• Typed input, SRGS processor written in Java

Shows that compiling VXML to ECMA is feasible.


Demo: http://it.e-technik.uni-ulm.de/~buehler


Experiment 2: VoiceXML goes PocketPC

• Compilation to EcmaScript

• Inclusion in HTML ("Embedding")

• Viewing with Pocket Internet Explorer

VoiceXML interpretation on PocketPC via EcmaScript interpretation (without Java)


Demo: http://it.e-technik.uni-ulm.de/~buehler


Experiment 2: Some glitches

• (Very) Limited grammar support

• FIA has to yield control, or else page will be in “Loading” state all the time (no interaction)

• Small modifications in FIA:
  1. Save state (form, item) and interrupt execution.
  2. On user input, resume with saved state.

• Still, a VoiceXML application runs on a limited device.
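The two modifications can be sketched as a FIA that returns to the host page whenever it needs input. Everything here is hypothetical: makeResumableForm, show and onUserInput are illustrative names, prompts are reduced to plain strings, and guards and event handling are left out to keep the control flow visible.

```javascript
// Sketch of a FIA that yields to the host page instead of blocking on input.
// All names are hypothetical; `show` stands in for writing a prompt to the page.
function makeResumableForm(formId, dialog, itemNames, show) {
  var saved = null; // saved state: { form, item } while waiting for input

  function selectItem() {
    for (var i = 0; i < itemNames.length; i++) {
      if (dialog[itemNames[i]] === undefined) { return itemNames[i]; }
    }
    return null;
  }

  // Run until input is needed, then save state and return control.
  function step() {
    var name = selectItem();
    if (name === null) {
      saved = null;
      return "done";
    }
    show(dialog[name + "$"].prompt);
    saved = { form: formId, item: name }; // 1. save state, interrupt execution
    return "waiting";
  }

  return {
    start: step,
    // 2. Called by the page's input handler: resume with the saved state.
    onUserInput: function (value) {
      if (saved === null) { return "done"; }
      dialog[saved.item] = value;
      return step();
    }
  };
}
```

A page would call start() once after loading and wire onUserInput to its input element, so the script never runs long enough to keep the browser in the "Loading" state.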


Limitations

VoiceXML 2.0 subset, similar to X+V:

• Mixed-initiative forms

• In-line and external grammars

• Links, exception handling

• Limited support for subdialogs

Important missing features:

• Application / document / session handling

Glitch: application/document scoping for <script>.


Outline

• Introduction

• Compilation procedure

• Preliminary results

• Conclusions, Future Work


Conclusions

• Practical compilation of a subset of VoiceXML

• No dynamically created documents

• Easy to use demos, also for off-line use

• VoiceXML’s dependency on EcmaScript is actually advantageous

• OS-independent compilation is feasible and sensible in certain environments, in particular on the way towards embedded systems


Future Work

• Automotive environment:
  - Compiler from ECMA to assembler for integration into existing technology (TEMIC)
  - Embedded Java as environment for SW integration in future cars (BMW Car IT)

• Platform integration for ASR

• Extend coverage 1: documents, applications

• Extend coverage 2: VoiceXML 2.1 goodies, e.g. <data> for obtaining XML data without <submit>

• Simplify compilation: Perhaps XSLT?

• Useful for “traditional VXIs”: Cache?

• Go SourceForge?


Thank you!