Developing an Effective Wireless Middleware Strategy
TRANSCRIPT
Mobility
• Mobility is “the ability to move freely”
• Wireless access to enterprises presents many challenges:
– bandwidth
– device issues
– synchronization of data
• Middleware needed to address the challenges
Wireless Access Challenges
• Bandwidth
– The pervasive devices are still mobile phones
– 2.5/3G roll-out has been slow
– Networks are unreliable
• Wi-Fi Solution
– 802.11b provides hot spots (range of ~300 feet)
– High speed (up to 11 Mbps)
– Access to wired network backbone
Wireless Access Challenges
• Device Issues
– Thin client
• Limited display, text only in some cases
• Battery life and memory very limited
• Open programming environments limited
– Thick client
• PDAs support 256-color displays
• Improved battery life, CPU and memory
• Standard operating systems
• Mic and speaker access
Wireless Access Challenges
• Synchronizing Data
– Replication model
• Hot-sync of data to docking station
• Static applications
• Lowers user acceptance
– Real-time access
• Requires reliable network connectivity
• Allows heavy lifting to be done in-network
• Increases user acceptance
Extend the Web
• Use existing web infrastructure
• Getting from the wireless device to the applications requires middleware
• Why isn’t 802.11 connectivity to web applications enough?
– Entering data on user devices is too difficult
– Limited displays and stylus input are difficult
Middleware Architecture
[Architecture diagram: a Multimodal Client/Platform exchanges audio and commands with a Multimodal Gateway over an audio path and a platform channel. The gateway components include an Intelligent Control Module (ICM) Resource Broker and a Multimodal Service Runtime Environment (SRE). The gateway reaches the Application Server via HTTP and uses SIP/MRCP to reach pools of SoftServer ASR, TTS and Streaming Media Servers; the legend distinguishes 3rd-party components from SoftServer components.]
Middleware to Effectively Add Speech
• Speech can greatly improve the user experience
• Press to talk – user speaks input to the device
• Audio cue – prompts can be used for tutorial, accessibility, etc.
• The middleware needs to allow access to the speech resources in a web-centric way
Speech Recognition
• Input
– Speech from user
– Grammar or dictionary of expected words/phrases
• Processing
– Phonetic classification using acoustic models
– Pattern matching with grammar
• Output
– Recognition matches and confidence
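
To make the flow concrete, here is a minimal sketch of a recognition transaction from the application's point of view. The Recognizer class and its result fields are hypothetical stand-ins, not a real engine API; they simply mirror the input/processing/output steps above.

# Hypothetical recognition transaction mirroring the steps above.
# A real engine scores acoustic models and matches against the grammar;
# this stand-in only compares normalized text.
class Recognizer:
    def __init__(self, grammar):
        self.grammar = set(grammar)     # expected words/phrases (input)

    def recognize(self, utterance):
        matches = [p for p in self.grammar if p == utterance.lower()]
        confidence = 0.95 if matches else 0.0   # illustrative score
        return {"matches": matches, "confidence": confidence}

rec = Recognizer(["check balance", "transfer funds"])
print(rec.recognize("Check Balance"))
# {'matches': ['check balance'], 'confidence': 0.95}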
Speech Synthesis
• Input
– Text/ASCII or XML markup (SSML)
– Language models
• Processing
– Phonetic concatenation of audio
• Output
– Audio representation of text
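
For example, SSML input to a synthesizer is plain XML markup. The snippet below is a minimal, illustrative SSML 1.0 document held in a Python string; the synthesizer that would consume it is assumed, not shown.

# Minimal SSML (W3C Speech Synthesis Markup Language) document.
# The TTS endpoint that would consume it is an assumption.
ssml = """<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US">
  Your flight departs at <say-as interpret-as="time">8:30am</say-as>.
</speak>"""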
Speech Ecosystem
• Built from IVR
– Speech has emerged as an extension of touch-tone IVR
• Hardware-Centric
– Telephony switch and board vendors have added speech technologies
• Wireless Access
– Data devices such as PDAs are NOT connecting over a voice channel!
Web Architecture for Speech
• Host-based Technologies
– The Speech Recognition and Speech Synthesis resources are software-based
– Can be deployed alongside web infrastructure
• Transactional Speech
– Client/server interaction for speech resources
– Each transaction defined as a Web Service
– Allows the developer to add speech where appropriate
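
As a sketch of what one such transaction might look like, the call below POSTs captured audio to a recognition web service. The endpoint URL, header names and response shape are illustrative assumptions, not a defined interface.

# Hypothetical web-service call for a single "recognize" transaction.
# The URL, headers and response format are assumptions for illustration.
import urllib.request

req = urllib.request.Request(
    "http://speech.example.com/recognize",        # hypothetical endpoint
    data=open("utterance.wav", "rb").read(),      # press-to-talk audio
    headers={
        "Content-Type": "audio/wav",
        "X-Grammar-URI": "http://app.example.com/grammars/main.grxml",
    },
)
with urllib.request.urlopen(req) as resp:
    print(resp.read())   # e.g. matches and confidence from the recognizer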
Role of the Middleware
• Resource Management
– Speech resources deployed on off-the-shelf hardware
– Requires a common session, media and control protocol
• Open Standards
– Session Initiation Protocol (SIP, RFC 3261)
– Real-time Transport Protocol (RTP, RFC 1889)
– Speech Services Control (SPEECHSC, IETF WG)
Session Control
• SIP
– Modeled after HTTP and SMTP
– Provides a client/server network programming model
– Transported via UDP
– REGISTER message allows dynamic binding of resources
– Includes proxy functionality for load balancing
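
For instance, a speech resource can announce itself to a SIP registrar with a REGISTER request like the sketch below. The addresses and tags are hypothetical, and a real request carries additional headers (Via branch parameter, authentication, etc.).

# Sketch: a speech resource registers itself over UDP so it can be
# dynamically bound. Addresses and tags are hypothetical.
import socket

register = (
    "REGISTER sip:registrar.example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP asr1.example.com:5060\r\n"
    "From: <sip:asr1@example.com>;tag=1234\r\n"
    "To: <sip:asr1@example.com>\r\n"
    "Call-ID: 42@asr1.example.com\r\n"
    "CSeq: 1 REGISTER\r\n"
    "Contact: <sip:asr1@asr1.example.com:5060>\r\n"
    "Expires: 3600\r\n"
    "Content-Length: 0\r\n\r\n"
)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(register.encode(), ("registrar.example.com", 5060))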
Media Transport
• RTP
– Currently used in streaming video/audio web applications
– Allows consumers/producers of audio to be connected via a network
– Transported via UDP
– Real-time packets reduce latency and the impact of lost packets
– Quality of service must be managed
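
The fixed RTP header (RFC 1889) is only 12 bytes, which is what keeps per-packet overhead low. The sketch below packs one header for 20 ms of 8 kHz mu-law audio; the field values are illustrative.

# Pack a minimal 12-byte RTP header (RFC 1889) for one audio packet.
import struct

version, padding, extension, csrc_count = 2, 0, 0, 0
marker, payload_type = 0, 0           # payload type 0 = PCMU (G.711 mu-law)
seq, timestamp, ssrc = 1, 160, 0x1234ABCD    # illustrative values

byte0 = (version << 6) | (padding << 5) | (extension << 4) | csrc_count
byte1 = (marker << 7) | payload_type
header = struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)
packet = header + b"\x00" * 160       # 20 ms of 8 kHz mu-law samples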
Resource Control
• SPEECHSC
– Working group chartered after the initial MRCP Internet drafts were submitted
– Defines a message set to control:
• Speech Recognizers
• Speech Synthesizers
• Streaming Prompt Servers
– Currently HTTP-style
– Web services being investigated
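
The control messages are HTTP-style plain text. The request below sketches how a client might start recognition against a previously loaded grammar; the exact request line and header names come from the evolving MRCP/SPEECHSC drafts, so treat the syntax here as illustrative.

# Sketch of an HTTP-style control message (MRCP-flavored) asking a
# recognizer to start. Header names and values are illustrative.
recognize_request = (
    "RECOGNIZE 10001 MRCP/1.0\r\n"
    "Content-Id: <main-menu@example.com>\r\n"   # grammar loaded earlier
    "Content-Length: 0\r\n"
    "\r\n"
)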
Component Servers for Speech
• Define each resource as a Component Server
– Resources acquired through SIP
– Media transmitted/received through RTP
– Control through MRCP:
• load grammar
• queue TTS
• play
• perform recognition
– Common resources pooled for resource management
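
Putting the pieces together, a client session against a recognition Component Server might follow the sequence below. SpeechSession is an imagined wrapper over SIP (session), RTP (media) and MRCP (control), not a real library.

# Hypothetical client flow against a pooled recognition Component Server.
class SpeechSession:
    def __init__(self, server):
        self.server = server          # acquired via SIP INVITE
    def load_grammar(self, uri): ...  # MRCP: load grammar
    def queue_tts(self, text): ...    # MRCP: queue TTS prompt
    def play(self): ...               # play queued audio over RTP
    def recognize(self): ...          # MRCP: perform recognition
    def close(self): ...              # SIP BYE returns resource to pool

session = SpeechSession("sip:asr-pool@example.com")
session.load_grammar("http://app.example.com/grammars/main.grxml")
session.queue_tts("Say a command after the tone.")
session.play()
result = session.recognize()
session.close()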
Media Session Framework (MSF)
[Diagram: three MSF servers, each fronted by a SIP User Agent Server (UAS), host an ASR Engine, a TTS Engine and a Prompt Cache. ASR, TTS and Prompt clients, each acting as a SIP User Agent Client (UAC), set up sessions via SIP messaging; RTP carries the media between each engine and the media source/target. UAC = SIP User Agent Client; UAS = SIP User Agent Server.]
Device Access to the Component Servers
• Devices must be able to use the resources in a standards-based way
[Diagram: a wired phone and a cell phone reach the Voice Service (with ASR/TTS) through a Media Gateway, while a handheld and a 2.5/3G cell phone reach it through the Multimodal Gateway; both gateways exchange SIP/RTP packets with the voice service.]
Conclusion
• Bandwidth for wireless networks is improving
• Middleware needs to address:
– resource management
– load balancing
– media resource integration
• Open standards-based approach