
8/8/2019 CASSI Speech Recognition


CASSI Speech Recognition:

Adding Speech Recognition to Embedded Devices

by

Praveen lvv



INTRODUCTION

What is CASSI?

Conversay Advanced Symbolic Speech Interface

It can be used in a variety of embedded systems.

It runs on either single- or dual-processor hardware designs.

CASSI provides continuous, speaker-independent speech recognition.

Conversay developers and customers write application code that uses the CASSI API to integrate speech recognition and text-to-speech (TTS) capability into embedded products.



What is TTS?

Text-To-Speech (TTS):

CASSI contains two modules for performing TTS: Rosetta and a TTS synthesis module.

Rosetta, the text-to-phonetics unit, accepts arbitrary written text as input and outputs a string of phonemes for CASSI to synthesize.
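A text-to-phonetics step of this kind can be pictured with a toy lookup. The sketch below is not Rosetta and does not use the CASSI API; the lexicon, the letter fallback table, and the function name text_to_phonemes are all assumptions made up for illustration.

```python
import re

# Hypothetical mini-lexicon; a real text-to-phonetics unit would be far larger.
LEXICON = {
    "yes": ["Y", "EH", "S"],
    "no": ["N", "OW"],
    "stop": ["S", "T", "AA", "P"],
}

# Crude letter-by-letter fallback for words missing from the lexicon (assumption).
LETTER_TO_PHONE = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "AA", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "AH",
    "v": "V", "w": "W", "x": "K S", "y": "Y", "z": "Z",
}


def text_to_phonemes(text):
    """Turn written text into a space-separated string of phoneme symbols."""
    phones = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in LEXICON:
            # Known word: use its stored pronunciation.
            phones.extend(LEXICON[word])
        else:
            # Unknown word: fall back to one symbol per letter.
            phones.extend(LETTER_TO_PHONE[ch] for ch in word)
    return " ".join(phones)


print(text_to_phonemes("Yes, stop!"))
```

The resulting phoneme string is what a synthesis module would then turn into audio; that second stage is not sketched here.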

Process of incorporating speech technology:

1. Definition of capabilities

2. Analysis of hardware resources

3. User interface design

4. Development



HARDWARE ENVIRONMENT:

Modular nature.

Suitable for a variety of systems.

Used with single-processor designs, where one processor handles all component execution.

Feature extraction and TTS synthesis may be separated onto their own DSP (or other front-end signal processor).

Front-End Block:

The front-end block is used for recognition and TTS functions.

Processor Block (Back-End):

The processor block performs all other code functions, including topic management and search.



AUTOMATIC SPEECH RECOGNITION

What does speaker-dependent / adaptive / independent mean?



What does continuous speech and isolated-word mean?

A continuous speech system operates on speech in which words are connected together, i.e. not separated by pauses.

An isolated-word system operates on single words at a time, requiring a pause between each word.

Isolated-word recognition is the simplest form of recognition.

Continuous speech is more difficult to handle because of a variety of effects, such as unclear word boundaries and coarticulation between neighbouring words.
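To see why the pause matters, here is a minimal sketch of an energy-based pause detector, one common way to cut a signal into isolated words. The function name split_on_pauses, the frame length, and the threshold are illustrative assumptions, not part of CASSI.

```python
def split_on_pauses(samples, frame_len=4, threshold=0.01, min_pause_frames=2):
    """Return (start, end) sample indices of speech stretches separated by pauses."""
    segments = []
    start = None      # sample index where the current word began
    silent_run = 0    # consecutive quiet frames seen inside the current word
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= threshold:
            if start is None:
                start = f * frame_len
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # The word ended where the quiet run began.
                segments.append((start, (f - silent_run + 1) * frame_len))
                start = None
                silent_run = 0
    if start is not None:
        # Signal ended while a word was still open.
        segments.append((start, (n_frames - silent_run) * frame_len))
    return segments


# Two "words" separated by a long pause are found as two segments.
isolated = [0.0] * 8 + [0.5] * 8 + [0.0] * 12 + [0.5] * 4
print(split_on_pauses(isolated))

# Two "words" with only a very short gap are merged into one segment.
connected = [0.5] * 8 + [0.0] * 4 + [0.5] * 8
print(split_on_pauses(connected))
```

In the second signal the gap is shorter than min_pause_frames, so the detector cannot tell where one word stops and the next starts. A continuous speech system has to find word boundaries by other means.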



The Process of Speech Recognition

Acoustic-Phonetic

Pattern Recognition

Artificial Intelligence

INTERFACE



The Experiment

"Yes" spoken by the first person

"Yes" spoken by the second person



The Basic Steps

Divide the sound wave into evenly spaced blocks.

Process each block for important characteristics.

Attempt to associate each block with a phone, which is the most basic unit of speech, producing a string of phones.

Find the word whose model is the most likely match.
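The basic steps on this slide can be sketched end to end on a synthetic signal. Everything specific here is an assumption for illustration: the two features (energy and zero-crossing rate), the phone templates, the word models, and the use of a sequence-similarity score in place of a true likelihood. Real recognisers use richer features and statistical models.

```python
from difflib import SequenceMatcher

# Hypothetical (energy, zero-crossing rate) templates, one per phone.
PHONE_TEMPLATES = {
    "sil": (0.00, 0.00),
    "S": (0.04, 1.00),
    "AH": (0.25, 0.43),
    "M": (0.04, 0.43),
}

# Hypothetical word models: each word is a string of phones.
WORD_MODELS = {"sum": ["S", "AH", "M"], "us": ["AH", "S"]}


def split_blocks(samples, size):
    """Step 1: divide the wave into evenly spaced, non-overlapping blocks."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def block_features(block):
    """Step 2: reduce a block to (mean energy, zero-crossing rate)."""
    energy = sum(x * x for x in block) / len(block)
    crossings = sum(1 for a, b in zip(block, block[1:]) if a * b < 0)
    return energy, crossings / (len(block) - 1)


def nearest_phone(feat):
    """Step 3: label a block with the phone whose template is closest."""
    def dist(name):
        e, z = PHONE_TEMPLATES[name]
        return (feat[0] - e) ** 2 + (feat[1] - z) ** 2
    return min(PHONE_TEMPLATES, key=dist)


def phone_string(samples, size=8):
    """Steps 1-3: blocks, features, phones; repeats merged, silence dropped."""
    phones, prev = [], None
    for block in split_blocks(samples, size):
        p = nearest_phone(block_features(block))
        if p != prev and p != "sil":
            phones.append(p)
        prev = p
    return phones


def best_word(phones):
    """Step 4: pick the word model most similar to the phone string."""
    def similarity(word):
        return SequenceMatcher(None, phones, WORD_MODELS[word]).ratio()
    return max(WORD_MODELS, key=similarity)


# Synthetic "recording": silence, a hiss, a loud voiced stretch, a soft one.
SIL = [0.0] * 8
S = [0.2, -0.2] * 4
AH = [0.5, 0.5, -0.5, -0.5] * 2
M = [0.2, 0.2, -0.2, -0.2] * 2
signal = SIL + S + S + AH + AH + AH + M + M + SIL

phones = phone_string(signal)
print(phones, best_word(phones))
```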



Speech recognition systems use the basic three-stage architecture:

Feature detection, in which the raw acoustic waveform is represented in a more useful space.

Probabilistic classification of the feature vectors, in which the frames are scored as more or less likely versions of each phone.

Search for the best word-sequence hypothesis, in which a word sequence is found that is consistent with the constraints of lexicon and grammar.
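The third stage can be illustrated with a small search over word sequences. In this sketch the first two stages are replaced by a hard-coded table of per-frame phone log-probabilities, and the lexicon, grammar, and all numbers are invented for illustration. The alignment uses a simple dynamic-programming pass in which each phone must cover at least one frame; production systems use far more elaborate searches.

```python
import math

PHONES = ["Y", "EH", "S", "N", "OW"]

# Hypothetical lexicon (word -> phones) and grammar (allowed word sequences).
LEXICON = {"yes": ["Y", "EH", "S"], "no": ["N", "OW"]}
GRAMMAR = [("yes",), ("no",), ("yes", "no"), ("no", "yes")]

NEG_INF = float("-inf")


def frame_scores(likely):
    """Stand-in for stages 1 and 2: log-probabilities for one frame."""
    return {p: math.log(0.6 if p == likely else 0.1) for p in PHONES}


def align_score(frames, phones):
    """Best log-probability of a left-to-right alignment of frames to phones."""
    prev = [NEG_INF] * len(phones)
    prev[0] = frames[0][phones[0]]
    for lp in frames[1:]:
        cur = [NEG_INF] * len(phones)
        for j, p in enumerate(phones):
            # Either stay in the same phone or advance from the previous one.
            advance = prev[j - 1] if j > 0 else NEG_INF
            best = max(prev[j], advance)
            if best > NEG_INF:
                cur[j] = best + lp[p]
        prev = cur
    return prev[-1]  # the alignment must end in the last phone


def best_sentence(frames):
    """Stage 3: pick the grammar-approved word sequence that scores best."""
    def score(sentence):
        phones = [p for word in sentence for p in LEXICON[word]]
        return align_score(frames, phones)
    return max(GRAMMAR, key=score)


# Five frames that look like Y, EH, EH, S, S.
frames = [frame_scores(p) for p in ["Y", "EH", "EH", "S", "S"]]
print(best_sentence(frames))
```

Only sequences listed in GRAMMAR and built from LEXICON entries are ever scored, which is how the lexicon and grammar constrain the search.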



ADVANTAGES OF SPEECH RECOGNITION

It makes it easy to search and index recorded audio and video data.

Speech recognition is also useful as a form of input.

It enables people working in active environments, such as hospitals, to use computers.

It enables people with disabilities to use computers.



CONCLUSION !!!

Visual cues to help computers decipher speech sounds that are obscured by environmental noise.

Speech-to-speech translation project for spontaneous speech

Multi-engine Spanish-to-English machine translation system

Building synthetic voices


Thank You