2013 Speech TEK - Alphanumeric Recognition Discussion

© 2002 – 2012 Versay Solutions, LLC. All rights reserved. Alphanumeric Speech Recognition. SpeechTek, August 19, 2013. Crispin Reedy

Upload: crispin-reedy

Post on 08-Jul-2015

DESCRIPTION

This morning's discussion on Alphanumeric Reco was great. Here are the slides for anyone who is interested. Thanks to all for sharing their experiences!

TRANSCRIPT

Page 1: 2013 Speech TEK - Alphanumeric Recognition Discussion

© 2002 – 2012 Versay Solutions, LLC. All rights reserved.

Alphanumeric Speech Recognition

SpeechTek

August 19, 2013

Crispin Reedy

Page 2: 2013 Speech TEK - Alphanumeric Recognition Discussion

The Problem With Alphanumerics

“The fault, dear Brutus, is not in our stars, but in ourselves”

-- Julius Caesar, Act I, scene ii

Page 3: 2013 Speech TEK - Alphanumeric Recognition Discussion

The Need

• Account Numbers

• Policy Numbers

• Spelling out names and addresses

• Special cases

– VIN, Canadian Postal Code

• And more…

Page 4: 2013 Speech TEK - Alphanumeric Recognition Discussion

Methods for Addressing

• Project Tactics

• Limit the grammar

– Constraint List

– N-Best + Back-End Data Validation

• Confirmation

• Prefiller

Page 5: 2013 Speech TEK - Alphanumeric Recognition Discussion

Project Tactics

• Can you avoid it?

– Phone number / SSN / Zip / DOB?

• Set expectations

– Not always easy!

• Describe the problem

• What tools do you have available?

– Constraints / patterns?

– Back-end data source available?

• Can you run a proof of concept / experiment?

Page 6: 2013 Speech TEK - Alphanumeric Recognition Discussion

Constraints and Patterns

• Does the number have any known pattern that can be used to limit possible values (and thereby improve recognition)?

– For example:

• First character is always A

• First three characters are always numbers

• Last characters are always C, G or T.

• If the answer is “no,” consider doing your own analysis.

– Even if you don’t think there is a pattern, there may be one.
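
One way to "do your own analysis" is to profile a sample of real identifiers position by position. The sketch below is only an illustration: the sample values, the function names, and the `max_choices` cutoff are all made up, and it assumes every identifier has the same length.

```python
from collections import Counter


def position_profile(identifiers):
    """Count which characters occur at each position of fixed-length identifiers."""
    length = len(identifiers[0])
    if any(len(ident) != length for ident in identifiers):
        raise ValueError("all identifiers must have the same length")
    return [Counter(ident[pos] for ident in identifiers) for pos in range(length)]


def describe_constraints(profile, max_choices=3):
    """Summarize each position as (index, kind, small set of allowed characters or None)."""
    findings = []
    for pos, counts in enumerate(profile):
        chars = sorted(counts)
        if all(c.isdigit() for c in chars):
            kind = "digits only"
        elif all(c.isalpha() for c in chars):
            kind = "letters only"
        else:
            kind = "mixed"
        # Report the exact characters only when the set is small enough to be useful.
        allowed = chars if len(chars) <= max_choices else None
        findings.append((pos, kind, allowed))
    return findings


# Made-up sample; in practice this would be an export of real account numbers.
sample = ["A12C", "A47G", "A90T", "A33C"]
for finding in describe_constraints(position_profile(sample)):
    print(finding)
```

A larger sample gives a more trustworthy profile; a pattern seen in a handful of values may not hold for the whole population.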

Page 7: 2013 Speech TEK - Alphanumeric Recognition Discussion

Applying Constraints

• Writing a grammar specifically for the pattern

– How complicated is it?

• Applying a constraint list.

– How big is it?
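
The two options can be compared with a small sketch. It checks recognizer output after the fact rather than building a recognizer grammar, so it only illustrates the idea. The pattern (an A, three digits, then C, G or T), the account values, and all names are hypothetical.

```python
import re

# Hypothetical pattern: 'A', then three digits, then one of C, G, T.
ACCOUNT_PATTERN = re.compile(r"A[0-9]{3}[CGT]")

# Hypothetical constraint list: every value that actually exists.
KNOWN_ACCOUNTS = frozenset({"A123C", "A456G", "A789T"})


def matches_pattern(candidate):
    """True if the whole candidate fits the pattern."""
    return ACCOUNT_PATTERN.fullmatch(candidate) is not None


def in_constraint_list(candidate):
    """True if the candidate is one of the known values."""
    return candidate in KNOWN_ACCOUNTS


def pattern_size():
    """Number of distinct strings the pattern allows: 1 * 10^3 * 3."""
    return 1 * 10 ** 3 * 3


unconstrained = 36 ** 5  # any five characters drawn from A-Z and 0-9
print(unconstrained, pattern_size(), len(KNOWN_ACCOUNTS))
```

The pattern stays simple no matter how many accounts exist, while the list is tighter but grows with the data, which is why "how complicated" and "how big" are the questions to ask.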

Page 8: 2013 Speech TEK - Alphanumeric Recognition Discussion

Using nBest + Back-End Data

• Collect using an unconstrained grammar

• Set your recognizer to return an nBest list.

• Use a web service / back-end data dip to determine which ones are “real.”

• Confirm the first “real” one on the list

– Throw out the ones that are not real.

• If the caller says no, confirm the second “real” one on the list.

– Potentially collect again after that.
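
The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `exists` stands in for the web service or data dip, `confirm` stands in for the yes/no confirmation dialog, and both names, along with the limit of two confirmations, are assumptions.

```python
from typing import Callable, Iterable, Iterator, Optional


def real_candidates(n_best: Iterable[str], exists: Callable[[str], bool]) -> Iterator[str]:
    """Yield n-best hypotheses that the back end knows, in recognizer order."""
    seen = set()
    for hypothesis in n_best:
        candidate = hypothesis.replace(" ", "").upper()
        if candidate in seen:
            continue  # skip duplicates so each value is looked up once
        seen.add(candidate)
        if exists(candidate):  # hypothetical back-end lookup
            yield candidate


def choose_value(
    n_best: Iterable[str],
    exists: Callable[[str], bool],
    confirm: Callable[[str], bool],
    max_confirmations: int = 2,
) -> Optional[str]:
    """Confirm up to max_confirmations "real" candidates; None means collect again."""
    for attempt, candidate in enumerate(real_candidates(n_best, exists)):
        if attempt == max_confirmations:
            break
        if confirm(candidate):  # hypothetical yes/no confirmation with the caller
            return candidate
    return None


# Made-up data: the back end knows two accounts; the caller meant the second one.
backend = {"BZ390", "VZ390"}
n_best = ["D Z 3 9 0", "B Z 3 9 0", "V Z 3 9 0"]
print(choose_value(n_best, backend.__contains__, lambda value: value == "VZ390"))
```

In this made-up run the top hypothesis is discarded without ever being played to the caller, because the back end does not know it.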

Page 9: 2013 Speech TEK - Alphanumeric Recognition Discussion

Confirmation Strategy

• PROTIP: Phonemes that are difficult for the recognizer to hear … are also difficult for humans to hear when they are spoken back.

• Confirm using letter names for easily confusable alphanumerics.

– “You said 8, 2, 7, G as in George, B as in Boy, 9. Is that right?”
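
A prompt like this can be assembled from a small table of letter names. The sketch below is illustrative only: "George" and "Boy" come from the example above, the other names in the table are placeholders, and the function name is made up.

```python
# Letter names for easily confused letters. Only G and B come from the example;
# the rest are placeholder choices.
LETTER_NAMES = {
    "B": "Boy",
    "G": "George",
    "D": "David",
    "P": "Paul",
    "T": "Tom",
    "V": "Victor",
}


def confirmation_prompt(value):
    """Read the value back, adding a letter name wherever one is defined."""
    parts = []
    for ch in value.upper():
        name = LETTER_NAMES.get(ch)
        parts.append(f"{ch} as in {name}" if name else ch)
    return "You said " + ", ".join(parts) + ". Is that right?"


print(confirmation_prompt("827GB9"))
# You said 8, 2, 7, G as in George, B as in Boy, 9. Is that right?
```

Digits and letters without an entry in the table are read back as they are, so the prompt only gets longer where the extra words actually help.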

Page 10: 2013 Speech TEK - Alphanumeric Recognition Discussion

What About Letter Names?

• Yes, with caveats:

– Do you have a special domain that would allow you to teach the caller letter names?

– Letter names invented by the caller will be quite variable.

• Some of the “oddballs” will never be recognized.

– If letter names are used during confirmation, and the utterance is re-collected, the caller may tend to use those letter names during the second collection.

• So add them to the grammar.

Page 11: 2013 Speech TEK - Alphanumeric Recognition Discussion

What About Letter Names?

• Yes, because:

– Longer utterances such as “B as in Boy” are less likely to cause false acceptances than short, easily confused utterances such as “G” and “T.”

• Make them separate rules so they can be weighted
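
Once letter names are allowed, whatever consumes the recognizer output has to collapse "B as in Boy" back to "B". Many grammars would return the bare letter directly through their semantic interpretation, so the sketch below only applies when the application receives raw text. The function name and the rule that any word starting with the same letter is accepted are assumptions.

```python
import re

# Matches phrases such as "b as in boy" in lower-cased text.
LETTER_NAME_RE = re.compile(r"\b([a-z]) as in ([a-z]+)\b")


def collapse_letter_names(transcript):
    """Turn 'b as in boy z 3 9 0' into 'BZ390'; raise ValueError on anything else."""

    def to_letter(match):
        letter, word = match.group(1), match.group(2)
        # Callers invent their own letter names, so accept any word that starts
        # with the spoken letter and leave mismatches untouched.
        return letter if word.startswith(letter) else match.group(0)

    tokens = LETTER_NAME_RE.sub(to_letter, transcript.lower()).split()
    if not tokens or any(len(tok) != 1 or not tok.isalnum() for tok in tokens):
        raise ValueError(f"cannot interpret transcript: {transcript!r}")
    return "".join(tok.upper() for tok in tokens)


print(collapse_letter_names("B as in Boy Z 3 9 0"))
# BZ390
```

A phrase whose word does not start with the spoken letter, such as "b as in dog", is rejected rather than guessed at.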

Page 12: 2013 Speech TEK - Alphanumeric Recognition Discussion

Using Prefiller

• “The account number is… B Z 3 9 0”

– Noticeable improvement in recognition of first letter

– Callers may offer the prefiller spontaneously

– Consider teaching the caller to say the prefiller

• Especially if you have repeat callers

Page 13: 2013 Speech TEK - Alphanumeric Recognition Discussion

Other Suggestions

• Look at speech recognition parameters that are not directly related to alphanumerics

– Are callers calling from a very noisy environment?

• Adjust overall speech threshold

– Timing of utterance collection?

• Listen to recordings of utterances to make sure the whole utterance is being collected

Page 14: 2013 Speech TEK - Alphanumeric Recognition Discussion

Specific Cases

• VIN

– Has specific pattern, but different for each manufacturer

– 17 characters: nobody will want to re-enter it if you get it wrong.
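
One constraint that does not vary by manufacturer is worth a sketch, although it goes beyond what the slide says. VINs never use the letters I, O or Q, and North American VINs carry a check digit in the ninth position, so many wrong hypotheses can be discarded before the caller hears them. The function name is made up, and the test value is simply a string whose check digit works out under this scheme.

```python
# Positional weights for the 17 VIN characters (the check digit itself has weight 0).
VIN_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)

# Numeric value of each allowed character; I, O and Q are deliberately absent.
VIN_VALUES = {str(d): d for d in range(10)}
VIN_VALUES.update(zip("ABCDEFGH", range(1, 9)))
VIN_VALUES.update(zip("JKLMN", range(1, 6)))
VIN_VALUES.update({"P": 7, "R": 9})
VIN_VALUES.update(zip("STUVWXYZ", range(2, 10)))


def vin_check_digit_ok(vin):
    """True if vin has 17 allowed characters and a matching ninth-position check digit."""
    vin = vin.upper()
    if len(vin) != 17 or any(ch not in VIN_VALUES for ch in vin):
        return False
    total = sum(VIN_VALUES[ch] * weight for ch, weight in zip(vin, VIN_WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected


print(vin_check_digit_ok("1M8GDM9AXKP042788"))  # True
print(vin_check_digit_ok("1M8GDM9AXKP042789"))  # False: last character changed
```

Applied to an n-best list, a check like this can filter hypotheses without a back-end lookup. VINs issued outside North America do not necessarily carry a valid check digit, so treat it as a filter only where it is known to apply.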

Page 15: 2013 Speech TEK - Alphanumeric Recognition Discussion

But which way is “the best?”

IT DEPENDS!