Speech Recognition Principles
Speech Recognition Concepts
[Figure: NLP and speech processing — speech synthesis maps text (a phone sequence) to speech; speech understanding/recognition maps speech back to text.]
Speech recognition is the inverse of speech synthesis.
Speech Recognition Approaches
Bottom-Up Approach
Top-Down Approach
Blackboard Approach
Bottom-Up Approach
[Figure: bottom-up pipeline — Signal Processing → Feature Extraction → Segmentation → Sound Classification Rules (voiced/unvoiced/silence) → Phonotactic Rules → Lexical Access → Language Model → Recognized Utterance, with knowledge sources feeding each stage.]
Top-Down Approach
[Figure: top-down architecture — Feature Analysis feeds a Unit Matching System driven by an inventory of speech recognition units; lexical, syntactic, and semantic hypotheses (drawing on a word dictionary, grammar, and task model) are checked by an utterance verifier/matcher to produce the recognized utterance.]
Blackboard Approach
[Figure: blackboard architecture — environmental, acoustic, lexical, syntactic, and semantic processes all read from and write to a shared blackboard.]
[Figure: an overall view of a speech recognition system, combining bottom-up and top-down processing. From Ladefoged 2001.]
Recognition Theories
Articulatory-Based Recognition
◦ Uses the articulatory system for recognition
◦ This theory has been the most successful so far
Auditory-Based Recognition
◦ Uses the auditory system for recognition
Hybrid-Based Recognition
◦ A hybrid of the two theories above
Motor Theory
◦ Models the intended gestures of the speaker
Recognition Problem
We have a sequence of acoustic symbols and want to find the words uttered by the speaker.
Solution: find the most probable word sequence given the acoustic symbols.
Recognition Problem (Cont'd)
A: acoustic symbols
W: word sequence
We should find $\hat{w}$ such that
$$P(\hat{w} \mid A) = \max_{w} P(w \mid A)$$
Bayes Rule
$$P(x \mid y)\,P(y) = P(x, y)$$
$$P(x \mid y) = \frac{P(y \mid x)\,P(x)}{P(y)}$$
$$P(w \mid A) = \frac{P(A \mid w)\,P(w)}{P(A)}$$
Bayes Rule (Cont'd)
$$P(\hat{w} \mid A) = \max_{w} P(w \mid A) = \max_{w} \frac{P(A \mid w)\,P(w)}{P(A)}$$
Since $P(A)$ does not depend on $w$, it can be dropped from the maximization:
$$\hat{w} = \arg\max_{w} P(w \mid A) = \arg\max_{w} P(A \mid w)\,P(w)$$
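As a concrete illustration of this decision rule, the sketch below scores two candidate word sequences by $P(A \mid w)\,P(w)$ and picks the argmax. All probabilities are invented toy numbers, not values from the source.

```python
# Toy Bayes decision rule: w_hat = argmax_w P(A|w) * P(w).
# All probabilities below are made-up numbers for illustration only.
candidates = {
    # word sequence: (acoustic likelihood P(A|w), language-model prior P(w))
    "recognize speech": (0.0020, 0.00010),
    "wreck a nice beach": (0.0025, 0.00001),
}

def score(w):
    p_acoustic, p_prior = candidates[w]
    return p_acoustic * p_prior  # P(A) is constant in w, so it is dropped

w_hat = max(candidates, key=score)
print(w_hat)  # "recognize speech": the stronger prior outweighs the acoustics
```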
Simple Language Model
$$w = w_1 w_2 w_3 \cdots w_n$$
$$P(w) = \prod_{i=1}^{n} P(w_i \mid w_1 w_2 \cdots w_{i-1})$$
Computing this probability directly is very difficult and requires a very large database, so trigram and bigram models are used instead.
Simple Language Model (Cont'd)
Trigram:
$$P(w) = \prod_{i=1}^{n} P(w_i \mid w_{i-1} w_{i-2})$$
Bigram:
$$P(w) = \prod_{i=1}^{n} P(w_i \mid w_{i-1})$$
Monogram (unigram):
$$P(w) = \prod_{i=1}^{n} P(w_i)$$
Simple Language Model (Cont'd)
Computing method (relative frequency):
$$P(w_3 \mid w_1 w_2) = \frac{\text{number of occurrences of } w_1 w_2 w_3}{\text{total number of occurrences of } w_1 w_2}$$
Ad hoc (interpolated) method, with weights $\lambda_i$ summing to 1:
$$P(w_3 \mid w_1 w_2) = \lambda_1 f(w_3 \mid w_1 w_2) + \lambda_2 f(w_3 \mid w_2) + \lambda_3 f(w_3)$$
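A minimal sketch of how the relative-frequency counts and the interpolated estimate could be computed. The toy corpus, the $\lambda$ weights, and all function names are illustrative assumptions, not from the source.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

tokens = "the cat sat on the mat the cat ran".split()  # toy corpus (assumption)
uni, bi, tri = (ngram_counts(tokens, n) for n in (1, 2, 3))
total = len(tokens)

def f_tri(w1, w2, w3):
    # relative frequency: occurrences of w1 w2 w3 / occurrences of w1 w2
    return tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0

def f_bi(w2, w3):
    return bi[(w2, w3)] / uni[(w2,)] if uni[(w2,)] else 0.0

def f_uni(w3):
    return uni[(w3,)] / total

def p_interp(w1, w2, w3, lambdas=(0.6, 0.3, 0.1)):
    """Interpolated trigram estimate; the lambda weights sum to 1."""
    l1, l2, l3 = lambdas
    return l1 * f_tri(w1, w2, w3) + l2 * f_bi(w2, w3) + l3 * f_uni(w3)

print(p_interp("the", "cat", "sat"))
```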
[Figure: From Ladefoged 2001.]
P(A|W) Computing Approaches
Dynamic Time Warping (DTW)
Hidden Markov Model (HMM)
Artificial Neural Network (ANN)
Hybrid Systems
Dynamic Time Warping (DTW)
To obtain a global distance between two speech patterns, a time alignment must be performed.
Example: a time alignment path between a template pattern "SPEECH" and a noisy input "SsPEEhH".
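A minimal DTW sketch under common assumptions (symmetric step pattern; unit cost for mismatched symbols). Real systems align frames of acoustic feature vectors rather than characters; the character-level example just mirrors the "SPEECH" / "SsPEEhH" illustration above.

```python
def dtw_distance(template, query, cost=lambda a, b: 0.0 if a == b else 1.0):
    """Global DTW distance between two sequences via dynamic programming."""
    n, m = len(template), len(query)
    INF = float("inf")
    # D[i][j] = best accumulated cost aligning template[:i] with query[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = cost(template[i - 1], query[j - 1])
            # allowed moves: diagonal (match), vertical and horizontal (warp)
            D[i][j] = local + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[n][m]

print(dtw_distance("SPEECH", "SsPEEhH"))  # small distance despite the noise
```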
Recognition Tasks
Isolated Word Recognition (IWR) and Continuous Speech Recognition (CSR)
Speaker-Dependent and Speaker-Independent
Vocabulary Size
◦ Small: <20 words
◦ Medium: 100–1,000 words
◦ Large: 1,000–10,000 words
◦ Very Large: >10,000 words
Error-Producing Factors
Prosody (recognition should be prosody-independent)
Noise (noise should be suppressed)
Spontaneous speech
Artificial Neural Network
[Figure: a simple computational element of a neural network — inputs $x_0, \dots, x_{N-1}$ weighted by $w_0, \dots, w_{N-1}$ produce output $y$.]
$$y = \varphi\!\left(\sum_{i=0}^{N-1} w_i x_i\right)$$
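A minimal sketch of this computational element, assuming a logistic sigmoid for the nonlinearity $\varphi$ (the slide does not specify which activation is used):

```python
import math

def neuron(x, w):
    """Single computational element: y = phi(sum_i w_i * x_i).
    The sigmoid activation is an assumption; the slide leaves phi unspecified."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-s))  # phi: logistic sigmoid

# toy inputs x_0..x_{N-1} and weights w_0..w_{N-1}
print(neuron([1.0, 0.5, -0.2], [0.4, -0.1, 0.9]))
```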
Artificial Neural Network (Cont’d)
Neural Network Types
◦ Perceptron
◦ Time-Delay Neural Network (TDNN)
◦ TDNN computational element
Artificial Neural Network (Cont'd)
[Figure: single-layer perceptron — inputs $x_0, \dots, x_{N-1}$, outputs $y_0, \dots, y_{M-1}$.]
Artificial Neural Network (Cont'd)
[Figure: three-layer perceptron.]
Hybrid Methods
Hybrid neural network and matched filter for recognition
[Figure: pattern classifier — speech → acoustic features → delays → output units.]
Neural Network Properties
The system is simple but highly iterative
It does not impose a specific structure
Despite its simplicity, the results are good
The training set is large, so training should be done offline
Accuracy is relatively good
Hidden Markov Model
Observations: $O = O_1, O_2, \dots, O_t$
States in time: $q = q_1, q_2, \dots, q_t$
All states: $s_1, s_2, \dots$
$a_{ij}$: probability of the transition from state $S_i$ to state $S_j$
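To make the notation concrete, here is a minimal forward-algorithm sketch for evaluating $P(O \mid \text{model})$ in a discrete HMM with transitions $a_{ij}$; the two-state model and all of its numbers are invented for illustration.

```python
def forward(A, B, pi, obs):
    """Forward algorithm for a discrete HMM.
    A[i][j] = a_ij (transition s_i -> s_j), B[i][k] = P(symbol k | state s_i),
    pi[i] = initial probability of state s_i. Returns P(obs | model)."""
    n = len(A)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]       # alpha_1(i)
    for o in obs[1:]:                                      # induction step
        alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)                                      # termination

# Toy two-state model; every number here is an assumption for illustration.
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],   # state s_1 mostly emits symbol 0
     [0.2, 0.8]]   # state s_2 mostly emits symbol 1
pi = [0.5, 0.5]

print(forward(A, B, pi, obs=[0, 1, 1]))  # P(O_1 O_2 O_3 | model)
```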