MCL-Collection and Voice Recognition
5 January 2007
Richard Fullard
MCL Technologies
Why Voice on Mobile Computers?
Picking Accuracy and Efficiency !!!
DEMO
Remote Viewer from Terminal-Screen
Demonstration
The Drivers behind Voice
Increases Mobile Worker efficiency and productivity
Minimum 15% increase in productivity (paper to voice picking). Simply faster.
Increased accuracy; 99.9% plus is not uncommon
No need to constantly ‘swap’ between the paper or terminal and the picking task itself. Never lose sight of the actual task.
Allows hands-free operation
Mobile Worker can easily pick the products off the shelf, and easily move or drive the picking cage, pallet, etc.
Allows eyes-free operation
Mobile Worker can focus on other activities such as safe driving
Pickers speak and listen while “on the move” (Highly developed multi-tasking skills that humans have been practicing for years)
History of Voice
Voice recognition systems have been commercially available for over 15 years but only recently ‘crossed the chasm’ (started to work well)
Early systems were Voice Independent:
A sample of the population was taken as the ‘voice mean’ against which the individual's voice profile was compared
Works quite well in white collar environments such as call centres
Does not work in Industrial environments
Works for dictating systems, Microsoft Office Products & PDA products, IBM’s Via Voice, telephone systems, etc.
History of Voice
Recently Voice-Dependent systems developed into stable and reliable solutions
An individual trains the system on only the words that are used in the application
Saves the individual’s voice profile in the system
The application compares the voice entry against the voice profile of the relevant individual on the system ensuring high accuracy of recognition
Smaller word-set, much higher accuracy
Used in industrial data capture applications
History of Voice
Up until recently, only “vertical” voice solutions were available
Dedicated, proprietary devices
But things have changed!
Industry wants Multimodal, it needs efficiency, but accuracy too. It doesn’t want a compromise!
RFID when it makes sense
Scan when it makes sense
Key when it makes sense
Speak when it makes sense
Eyes and Ears when it makes sense
What is MCL-Voice
MCL-Voice is a fully standard MCL solution with the addition of Voice recognition and Voice synthesis capabilities
[Architecture diagram: MCL-Designer; MCL-Client Voice; MCL-Client + Vocollect Voice; MCL-Link / MCL-Net communication & dispatching servers (RS232, Modem, GSM/GPRS, Ethernet, Internet, WiFi 802.11); MCL ODBC Bridge; MCL DLL Bridge; MCL SAP R/3 Bridge; MCL … Bridge…; Your Host Systems]
What is Voice Recognition
Voice Recognition is a technology that converts the analog signal coming from a microphone (voice) into a sequence of digital bytes (words)
(Also known as “ASR” or “Speech To Text” )
[Diagram: Voice Signal → Data Word “1”]
What is Voice Synthesis
Voice Synthesis is a technology that converts data words into an analog speech signal in a given language
(also named “Text to Speech” or “TTS”)
[Diagram: Data Word “1” → Voice Signal “One”]
Embedded Voice Engine
Voice Recognition and Voice Synthesis are performed on the terminal (no server required)
System Requirements
Terminal Hardware Certified by MCL Technologies
Windows operating system: Microsoft® Windows® Pocket PC 2003/CE4/CE5/WM5.
> 300 MHz XScale processor.
Min 64 MB memory.
Audio Interface (Jack).
MCL Certified headset (with quick release connector)
MCL-Voice Client (activated)
Certified Devices / Dec 2006
Symbol Technologies: MC50, MC70, MC90xx, WT40xx
Intermec: CN2B, 730-751B
Terminal Accessories
Quick release connector
Operator safety feature
• Quickly separates headset cable from the terminal.
• High quality release mechanism supports multiple connections and disconnections for personalized headsets.
MCL-Collection Voice
Three new MCL-Collection components:
MCL-Designer Voice Add-on (for development)
• Requires an additional License to standard MCL-Designer.
MCL-Client Voice (for deployment)
• Requires a specific Voice Client License.
MCL-Voice Manager (for deployment)
• Requires a specific product License.
Multimodal Approach
What Is Multimodal?
Definition - MCL Multimodal access.
The combination of multiple data capture technologies in one mobile worker application.
Barcode scanners. Imagers. Displays. Touch screens. Signature capture. Keyboards. Weigh scales. Printers. … and now, Voice.
What is Multimodal
Multimodal Data Capture
Voice in Multimodal Applications
Consider retail inventory application.
Barcode scan item UPC or EAN.
Keyboard or voice entry of quantity.
Hands full of merchandise – voice more expedient.
Consider warehouse receiving application.
Radio frequency identification of pallet.
Voice acceptance of pallet.
Voice receipt of item level goods.
Hands free operation.
Mobile worker has flexibility to use whichever input method is more convenient and efficient.
Voice Recognition
Application Vocabularies
Voice Templates
Application Vocabularies
Unique vocabulary for each MCL-Voice application.
Action words used by mobile worker to:
Enter transaction data.
Navigate within the application.
Control Voice Parameters.
MCL-Designer builds application vocabulary word lists.
Mobile worker applications typically have small vocabularies.
• 50 words very typical (+/- 120 KB).
• 100 words unusually large.
Application Vocabularies
MCL-Voice vocabulary word categories.
Action Words:
Application words.
• Task specific words: data entry of transaction data. Such as “Quantity”, “1”, “2”, “3”, “4”.
• Navigation words: commands addressed to the MCL engine. Such as “Picking” to branch to the Picking function.
Global words.
• Words that you define to be valid on every screen.
• Commonly used for application navigation words. Such as “Next”, “Back”, “Clear”, “Previous”, “Delete”, “Enter”.
Control words.
• Commands to control the voice engine. Such as “Pause”, “Resume”, “Volume Up”, “Volume Down”.
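The three word categories above can be sketched as sets that are merged per screen. This is only an illustration in Python, not MCL-Designer syntax: the `Screen` class, `active_vocabulary` and `classify` are hypothetical names, and the word lists are the examples from the slide.

```python
from dataclasses import dataclass, field

# Example words from the slide; real lists are defined in MCL-Designer.
GLOBAL_WORDS = {"Next", "Back", "Clear", "Previous", "Delete", "Enter"}
CONTROL_WORDS = {"Pause", "Resume", "Volume Up", "Volume Down"}


@dataclass
class Screen:
    """Hypothetical screen holding its own application words."""
    name: str
    application_words: set = field(default_factory=set)

    def active_vocabulary(self) -> set:
        # Global and control words are valid on every screen.
        return self.application_words | GLOBAL_WORDS | CONTROL_WORDS


def classify(word: str, screen: Screen) -> str:
    """Return the category of a recognized word, or 'rejected'."""
    if word in CONTROL_WORDS:
        return "control"
    if word in GLOBAL_WORDS:
        return "global"
    if word in screen.application_words:
        return "application"
    return "rejected"


picking = Screen("Picking", {"Quantity", "1", "2", "3", "4"})
print(classify("Pause", picking))
print(classify("Quantity", picking))
print(len(picking.active_vocabulary()))
```

Keeping global and control words separate from the screen's own words mirrors the slide: only the application words change from screen to screen.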
Voice Recognition Training
Like fingerprints, everybody has a unique voice print.
Different accent, different pronunciation of words.
Each individual must “train” the voice recognizer to:
Understand the application vocabulary for that individual’s voice pattern.
Allow worker to speak naturally and be understood.
Training generates a “User Voice Template”
Voice Recognition Training
User voice template.
The “User Voice Template” file contains:
• The selected language.
• The specific User’s settings (volume, speed, pitch, sensitivity).
• All the application words and their “speaking image”.
Voice training is performed on the mobile computer.
• Typically +/- 20 minutes per worker to train a 50-word vocabulary.
Voice training result saved in unique template for each worker.
• Suggest using employee ID or badge ID to create unique, personalized template names.
• Template can be uploaded to a server and deployed to any other device.
Voice Recognition Training
How is the voice template used?
Mobile worker says a word.
Voice recognizer compares spoken word against template.
Spoken word translated into template word it matches.
The smaller the vocabulary:
• The faster the comparison.
• The more accurate the match.
Multi-User.
Voice templates for several individuals may be:
• Saved on a mobile computer.
– Allows a mobile computer to be shared by several workers.
• Saved and downloaded from a central server.
– Allows a pool of mobile computers to be shared by all mobile workers.
Multi-User
User Voice Template File & Voice Preferences are loaded in the device (for execution) & on server (for distribution)
[Diagram: Multi-Users. User Voice Templates (UVT) for User 1 and User 2 are downloaded from and uploaded to the Server.]
Advanced Features
Advanced Features
Noise Cancellation on MCL Certified Headsets.
MCL-Voice operates in noisy, industrial environments:
• Warehouses, distribution centers, dockyards, transportation hubs, assembly/manufacturing plants…
Very effective noise suppression on certified headsets.
High recognition accuracy depends directly on the ability to reduce ambient noise.
• Impossible task on inferior headsets.
To maximize voice recognition:
• Perform voice recognition training in the actual work environment.
• Creates most representative voice templates.
Advanced Features
Noise sampling.
Adjusts microphone pickup levels to compensate for ambient noise level.
Noise sampling can be performed:
• Automatically on boot up.
• On mobile worker demand.
Advanced Features
Dynamic voice training.
Incremental addition of vocabulary words to existing template.
On boot up.
• Untrained words introduced by new version of application.
• Prompts mobile worker to train any untrained words.
• Consider application vocabulary of 50 words.
– New version introduces one new word.
– Mobile worker trains only the new word, not all 51 words.
On demand by mobile worker.
• Recognizer continually has trouble understanding a given word.
• Retrain any poorly trained words.
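The boot-up check described above amounts to a set difference between the application vocabulary and the words already in the user's template. A minimal Python sketch, assuming both can be read as lists of words; `untrained_words` and the sample vocabulary are hypothetical and do not reflect how MCL-Client stores templates.

```python
def untrained_words(application_vocabulary, trained_words):
    """Return vocabulary words missing from the user's template, in vocabulary order."""
    trained = set(trained_words)
    return [word for word in application_vocabulary if word not in trained]


# The slide's example: a 50-word vocabulary gains one word in a new version.
old_vocabulary = [f"word{i}" for i in range(50)]
new_vocabulary = old_vocabulary + ["Recount"]

to_train = untrained_words(new_vocabulary, old_vocabulary)
print(to_train)  # ['Recount']
```

Only the one new word is returned, so the worker is prompted once instead of repeating the full training session.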
Advanced Features
Fast voice recognition.
Explicit application vocabulary.
• Very efficient.
Performed on the mobile computer.
• Voice recognition and audio feedback virtually instantaneous.
Advanced Features
Vocabulary optimization.
Vocabulary subset definable on each input field.
Limits template choices valid for match.
• Further decreases voice recognition search times.
• Further increases word match accuracies.
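To illustrate why a per-field subset helps, the sketch below uses a toy matcher: each template word is a two-number feature vector and the closest vector wins. This is an assumption made purely for illustration. The real recognizer's features and comparison method are not described in the slides, and `best_match` and the sample values are hypothetical.

```python
import math


def best_match(utterance, template, allowed_words):
    """Compare the utterance only against template entries allowed on this field."""
    candidates = {w: template[w] for w in allowed_words if w in template}
    if not candidates:
        return None
    # Smallest Euclidean distance is treated as the best match (toy model).
    return min(candidates, key=lambda w: math.dist(utterance, candidates[w]))


# Toy template: word -> feature vector (values invented for the example).
template = {"1": (0.10, 0.90), "2": (0.80, 0.20), "Next": (0.15, 0.85), "Back": (0.50, 0.50)}
utterance = (0.14, 0.86)  # a slightly noisy "1"

print(best_match(utterance, template, template.keys()))  # Next
print(best_match(utterance, template, ["1", "2"]))       # 1
```

Against the full vocabulary the noisy utterance lands closer to “Next”. Restricted to the quantity field's subset it resolves to “1”, with fewer comparisons.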
Advanced Features
Audio Feedback (Echo)
Any input.
Mobile worker says “five” into microphone, and immediately hears synthesized “five” in the headset.
• Immediate verification.
Multimodal approach.
• Any input can be echoed:
– barcode scanned data, keyboard entry data, etc.
• MCL-Voice TTS synthesizer sees all data the same regardless of the original multimodal input source of the data.
Advanced Features
Talk Over
Experienced operators may speak the data before the prompt is given.
• Prompts are canceled if operator enters a valid response.
MCL-Voice Certification
Guidelines and Best Practices
Why?
Voice recognition is not a “black and white” science.
How to position MCL-Voice. Benefits of MCL-Voice. Customer competitive advantages.
Why?
Design and implementation of a successful voice application is done by following guidelines and best practices. Technical training teaches, for example:
Avoid vocabulary items like “cherry” and “sherry” on the same data entry field.
Avoid words and phrases with common initial words, like “orange” and “orange peel”.
Avoid vocabulary words similar to environment background noise.
• Consider a work area equipped with air compressors. Avoid vocabulary words that end with “S” and “Z”, like “Pass”. The voice recognition engine might interpret the background hiss as a valid vocabulary word.
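Two of the guidelines above can be checked mechanically on a word list. The Python sketch below is a hypothetical helper, not part of MCL-Designer. It flags entries that share an initial word and words ending in “s” or “z”. Sound-alike pairs such as “cherry” and “sherry” need a phonetic comparison and are not covered here.

```python
from itertools import combinations


def shared_initial_word(vocabulary):
    """Return pairs of entries whose first word is the same (e.g. 'orange' / 'orange peel')."""
    pairs = []
    for a, b in combinations(sorted(vocabulary), 2):
        if a.lower().split()[0] == b.lower().split()[0]:
            pairs.append((a, b))
    return pairs


def sibilant_endings(vocabulary):
    """Return words ending in 's' or 'z', which may be confused with background hiss."""
    return sorted(w for w in vocabulary if w.lower().endswith(("s", "z")))


# Entries are assumed to be non-empty strings.
vocabulary = {"orange", "orange peel", "Pass", "Next"}
print(shared_initial_word(vocabulary))  # [('orange', 'orange peel')]
print(sibilant_endings(vocabulary))     # ['Pass']
```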
Requirements
Training.
Successful completion of the MCL-Collection Voice Technical Training course by at least two representatives from your company.
Successful completion of the MCL-Collection Voice Sales Training course by at least one individual from your company.
Equipment.
Purchase of an MCL-Collection Voice Demo Kit by your company.
Company Certification
Given to companies that satisfy all Certification Requirements.
Tied to both the company and the trained individuals.
Lost by a company when one or both trained individuals leave the company.
Not automatically given to a company that hires individuals with prior MCL-Collection Voice training.
Each company must satisfy all Certification Requirements to receive the MCL-Collection Voice Certification.
Company Certification
Benefits
Lists your company in an online directory.
Entitles your company to purchase MCL-Voice products.
Develops best practices technical skills to implement successful voice enabled applications.
Provides understanding of the benefits voice brings to mobile workforce deployments.
Elicits customer confidence.
Gives access to MCL Support.
MCL-Designer
Voice Enabling a Project
MCL-Designer Voice Enabling a Project
Voice Enabling a project
At Project level …
At Screen level …
At Screen Object level …
At Process Object level …
MCL-Designer Voice Enabling a Project
At Project level
All settings defined at “Project Level” are used by all programs and processes of the project.
MCL-Designer Voice Enabling a project at Project level
MCL Project components
• Terminal Setup
• Programs
• Local files
• System & User Variables
• Image files
• Fonts & Styles
• Keyboard definition
• Voice Settings
The “Voice settings” define the “ASR” and “TTS” parameters.
This includes the language and the speed definition, the global field words etc…
MCL-Designer Voice Enabling a project at Project level
Voice settings
MCL-Designer Voice Enabling a project at Project level
Settings Parameters:
Speech In
Speech Out
Global Words
Control Words
Phonetic Substitution Table
MCL-Designer Voice Enabling a project at Project level
Speech Out
Text       Will be spoken as
ABC        “alpha bravo charlie”
Morning    “morning”
A12        “alpha one two”
A 12       “alpha twelve”
25         “twenty five”
123        “one hundred twenty three”
AB12D      “alpha bravo one two delta”
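The seven examples above are consistent with a simple rule set, sketched below in Python. The rules are inferred from the examples only and are an assumption, not the documented behaviour of the MCL-Voice synthesizer. Under these rules, all-digit tokens are read as numbers, ordinary words are read as words, and all-uppercase or mixed alphanumeric tokens are spelled out character by character. `spoken_form` and `number_words` are hypothetical names, and numbers are limited to 0 through 999.

```python
import string

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]
NATO = dict(zip(string.ascii_uppercase,
                ("alpha bravo charlie delta echo foxtrot golf hotel india juliett kilo "
                 "lima mike november oscar papa quebec romeo sierra tango uniform "
                 "victor whiskey xray yankee zulu").split()))


def number_words(n):
    """Spell an integer from 0 to 999 in the style of the slide ('one hundred twenty three')."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (" " + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + number_words(rest) if rest else "")


def spoken_form(text):
    """Apply the inferred rules token by token (tokens are separated by spaces)."""
    spoken = []
    for token in text.split():
        if token.isdecimal():
            spoken.append(number_words(int(token)))      # "25" -> "twenty five"
        elif token.isalpha() and not token.isupper():
            spoken.append(token.lower())                  # "Morning" -> "morning"
        else:
            for ch in token:                              # "A12" -> "alpha one two"
                if ch.isdecimal():
                    spoken.append(ONES[int(ch)])
                else:
                    spoken.append(NATO.get(ch.upper(), ch))
    return " ".join(spoken)


for sample in ["ABC", "Morning", "A12", "A 12", "25", "123", "AB12D"]:
    print(sample, "->", spoken_form(sample))
```

The space in “A 12” matters under these rules: it splits the text into a letter token and a number token, which is why it is read as “alpha twelve” rather than “alpha one two”.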
MCL-Designer Voice Enabling a project at Project level
Speech Out
Special symbol
#    pound
$    dollar
€    euro
&    ampersand
*    asterisk
+    plus
-    dash
.    point
/    slash
\    back slash
=    equals
MCL-Designer Voice Enabling a project at Project level
The Input timers
MCL-Designer Voice Enabling a project at Project level
Settings Parameters:
Speech In
Speech Out
Global Words
Control Words
Phonetic Substitution Table
Note: Words are case sensitive
MCL-Designer Voice Enabling a project
At Screen level
All settings defined at “Screen Level” are used by all objects and processes of the selected screen.
MCL-Designer Voice Enabling a project at Screen level
MCL “Voice” program structure
Process-In
• Process-In lines
Voice object definitions
Start of screen Options
• Clear Screen
• Backlight
• Screen Label
Process-Out
• Process-Out lines
Screen
• Display data
• Input fields (with “In Screen” processes)
• Buttons
• Menu
• Etc.
Next screen
MCL-Designer Voice Enabling a project at Screen level
Enabling or Disabling Speech In & Speech Out independently
MCL-Designer Voice Enabling a project
Screen Object level
All settings defined at the level of a specific screen or process object are used by this object only.
MCL-Designer Voice Enabling a project at Screen Object Level
Output Screen Objects
Display Text
Display Variable
MCL-Designer Voice Enabling at Screen Object Level
Input Screen Objects
Input Barcode and Keyboard
Input Spin
Input List
Pull Down List
Check boxes
Radio buttons
Text buttons and Image buttons
Menu Text and Menu Buttons
File Browse
MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Input Barcode and Keyboard, Input Spin, Input List
Focus
Prompt
Words
Word List 1, 2 & 3
Audio Feedback
Completion
Prompt
MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Pull Down List, Check box, Radio buttons
Focus
Focus Word
Words
Word List 1, 2 & 3
Audio Feedback
MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Text buttons, Image buttons
Focus
Focus Word
MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Menu Text, Menu buttons
Focus
Prompt
Words
Word List 1, 2 & 3
Audio Feedback
MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: File Browse
Focus
Prompt
Words
Word List 1, 2 & 3
Audio Feedback
Completion
Prompt
MCL-Designer Voice Enabling a project
At Process Object level
All settings defined at the level of a specific screen or process object are used by this object only.
MCL-Designer Voice Enabling a project at Process Object Level
Double Click
Process-Out
Process-In
MCL-Designer Voice Enabling a project at Process Object Level
In-Screen processes
MCL-Designer Voice Enabling a project at Process Object Level
Voice Processes
Speech Input
Speech Output
Play Sound Wave
Noise Sample
Voice Training
Set Voice State
Set Voice Operator
Set Recognizer parameters
Set Synthesizer parameters
MCL Voice: Principles
MCL-Client Voice Recognizer: Principle
Voice Engine Settings
Maximum delay between Words
Default Timers
• For Input Fields: NRT = Noise Rejection Timer
• For Combined Words: MWT = Multiple Word Timer
MCL-Client Voice Recognizer: Principle
Voice Recognizer Settings
• Sensitivity settings
MCL-Client Voice Recognizer: Principle
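[Diagram: the Signal enters the ASR, which uses the Voice Template and the Word List to produce a Best Match Word and its Score. The Score is compared (+/-) with the Threshold to yield the Data.]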
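The recognizer principle shown on this slide (best match word, score, threshold) can be sketched as a simple accept or reject decision. The Python below is illustrative only. It assumes that a higher score means a better match and that the engine has already scored the signal against each template word. The score scale, the `recognize` function and the sample values are hypothetical.

```python
def recognize(scores, word_list, threshold):
    """Return the best-scoring word from word_list, or None if its score is below threshold.

    scores: mapping of template word -> match score for the current signal
            (assumed here: higher is better).
    """
    candidates = {w: s for w, s in scores.items() if w in word_list}
    if not candidates:
        return None
    best_word = max(candidates, key=candidates.get)
    return best_word if candidates[best_word] >= threshold else None


word_list = {"five", "nine"}
print(recognize({"five": 0.82, "nine": 0.64}, word_list, threshold=0.7))  # five
print(recognize({"five": 0.41, "nine": 0.38}, word_list, threshold=0.7))  # None
```

In the second call the best match still exists, but its score is too low, so nothing is passed on as data. This is how background noise can be rejected instead of being forced onto the nearest word.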
MCL-Designer Voice Enabling a project at Process Object Level
Consider that the template of a word is like an image
Original image: this is the “template” image
This is the image with a certain level of noise
Same Image? YES
MCL-Designer Voice Enabling a project at Process Object Level
Original image: this is the “template” image
This is the image with the same level of noise
Same Image? NO
MCL-Designer Voice Enabling a project at Process Object Level
Original image: this is the “template” image
This is the image with a higher level of noise
Same Image? Maybe
MCL Voice Manager
Voice-Manager
Management & Tuning Software for Voice Users & Terminals
• Log User Voice Data
Date & time
Terminal ID
Spoken Word
Word Score
Result
Voice-Manager
Management & Tuning Software for Voice Users & Terminals
• User Score over Time
Average Score Per Hour
Word & Noise Volume Per Hour
Voice-Manager
Management & Tuning Software for Voice Users & Terminals
• Statistical Analyzer
Score Distribution
Scoring Split Analysis
Words versus Noise Volume
Word Details Analyzer
Voice-Manager
Management & Tuning Software for Voice Users & Terminals
• Analysis & Recommendation
Suggested Word To Re-Train
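One plausible way to derive a re-train suggestion from the logged fields (date and time, terminal ID, spoken word, word score, result) is to average the score per word and flag words below a cutoff. The Python sketch below is an assumption for illustration and not Voice-Manager's actual analysis. `VoiceLogEntry`, `words_to_retrain` and the cutoff value are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class VoiceLogEntry:
    """One logged recognition event, using the fields listed on the slide."""
    timestamp: datetime
    terminal_id: str
    spoken_word: str
    score: float      # assumed: higher is better
    accepted: bool    # the logged result


def words_to_retrain(entries, min_average_score):
    """Return words whose average logged score falls below the cutoff."""
    scores_by_word = defaultdict(list)
    for entry in entries:
        scores_by_word[entry.spoken_word].append(entry.score)
    return sorted(word for word, scores in scores_by_word.items()
                  if mean(scores) < min_average_score)


now = datetime(2007, 1, 5, 9, 0)
log = [
    VoiceLogEntry(now, "T01", "five", 0.9, True),
    VoiceLogEntry(now, "T01", "five", 0.8, True),
    VoiceLogEntry(now, "T01", "nine", 0.5, False),
    VoiceLogEntry(now, "T01", "nine", 0.6, True),
]
print(words_to_retrain(log, min_average_score=0.7))  # ['nine']
```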
MCL Client
Installation & Activation
MCL-Client Activation
The “MCL-Voice” client will generally be installed either on the Flash memory or on an SD card
Installing on the Flash memory
Using ActiveSync, install the “MCL-Voice Client” on the device using the .exe file. ActiveSync will create the necessary folders and install the “MCL-Voice” client.
Installing on an SD card
The .zip file must be unzipped and copied to the SD card. The card is then placed in the terminal.
MCL-Client Activation
Start MCL on the device
Define the Terminal and Subnet IDs.
Go to the Activation screen.
Enter the License number and the Activation code.
Notes:
The MCL-Client with MCL-Voice does not support the “Demo mode”.
The terminal activation uses the “Off-Line” activation procedure.
The MCL.key file is stored in the main folder of the MCL-Voice client.
MCL-Client System Menu
System Menu
This screen gives access to the different options of the System Menu
The “Setup” option is used to access the “Voice” settings menu.
Q & A