Post on 21-Dec-2015

SCORPION: A New Approach to Design Reliable Real-Time Speech Recognition Systems

F. Vargas, R. D. Fagundes, D. Barros Jr.

[email protected]

Catholic University – PUCRS

Electrical Engineering Dept. Av. Ipiranga, 6681

90619-900 Porto Alegre

Brazil


Summary

1. Preliminary Considerations on the General Structure of Speech Recognition Systems (SRS)

2. SCORPION: The Proposed Approach

2.1. Partitioning the SRS in HW and SW

2.2. Implementing Fault-Tolerance in the Parts of the SRS

2.2.1. Concurrent Consistency Check (CCC)

2.2.2. Transparent BIST

3. Experimental Results

4. Final Considerations & Future Work


1. Preliminary Considerations on the General Structure of Speech Recognition Systems (SRS)

- DSP components/systems dedicated to speech recognition: increasingly used in today's applications, ranging from cellular phones to voice-oriented bank transactions and security systems.

- Increased demand for real-time response and reliable operation.

Fig. 1. General block diagram of speech recognition systems. Block labels: Speech (input), Signal Analysis, Feature Extraction, Vector Quantization (VQ) Codebook, Pattern Matching, Reference Patterns (hidden Markov models, HMMs), Logic Decision, Words (output).


Fig. 2. (a) Signal Analysis & Conditioning Block, (b) Pattern Matching & Logic Decision Block, (c) implementation to recognize 3 words.

Block labels in (a): Speech Sound, Low Pass Filter, Sampling, Pre-Emphasis Filter, Windowing, LPC/Cepstral Analysis, VQ, Vector Codebook, Observation Sequences (to the Pattern Matching Block).

Block labels in (b) and (c): Observation Sequences (from the Signal Analysis Block); HMM (word 1), HMM (word 2), HMM (word 3), with Local Cache Memory 1, 2 and 3; Logic Decision; word recognition.
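Two of the stages named in Fig. 2(a), the pre-emphasis filter and windowing, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the first-order filter form, the coefficient 0.95, the Hamming window, and the frame size and step are all assumptions chosen because they are common in speech front ends.

```python
import math

def pre_emphasis(samples, alpha=0.95):
    """First-order high-pass: y[n] = x[n] - alpha * x[n-1] (alpha is an assumption)."""
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out

def hamming(size):
    """Hamming window coefficients (assumed window type)."""
    if size == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (size - 1)) for k in range(size)]

def windowed_frames(samples, size, step):
    """Split the signal into overlapping frames and apply the window to each."""
    window = hamming(size)
    return [
        [samples[start + k] * window[k] for k in range(size)]
        for start in range(0, len(samples) - size + 1, step)
    ]

# Hypothetical usage: emphasize, then cut into frames of 4 samples with step 2.
emphasized = pre_emphasis([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
frames = windowed_frames(emphasized, size=4, step=2)
```

Each frame would then go to the LPC/cepstral analysis and VQ stages, which are not sketched here.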


2. SCORPION: The Proposed Approach

2.1. Partitioning the SRS in HW and SW

Specific SRS Dataflow:

- Low complexity, high volume: parallel additions followed by bitwise XOR operations to perform pattern matching and logic decision.

- High complexity, high volume: a sequence of digital filtering operations to adjust frequency variations during the signal analysis and conditioning procedure.


Pattern Matching & Logic Decision Block: almost no data dependency, hence assigned to the HW part.

Signal Analysis & Conditioning Block: data are strongly dependent on each other, hence assigned to the SW part.


Fig. 3. SRS main blocks, after partitioning into SW and HW parts. Block labels: Speech Sound (input), Signal Analysis & Conditioning Block (SW part), Observation Sequences, Pattern Matching & Logic Decision Block (HW part), word recognition (output).
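The lack of data dependency in the Pattern Matching & Logic Decision Block can be illustrated with a small sketch: each word's HMM scores the same observation sequence independently, and the logic decision picks the best score. The slides do not say which scoring algorithm is used; the log-domain Viterbi recursion below is an assumption, chosen because it needs only additions and comparisons. The model parameters and the names `MODELS`, `viterbi_log_score` and `recognize` are hypothetical.

```python
import math

def viterbi_log_score(obs, log_pi, log_a, log_b):
    """Best-path log-likelihood of a discrete observation sequence for one HMM."""
    states = range(len(log_pi))
    delta = [log_pi[s] + log_b[s][obs[0]] for s in states]
    for symbol in obs[1:]:
        delta = [
            max(delta[p] + log_a[p][s] for p in states) + log_b[s][symbol]
            for s in states
        ]
    return max(delta)

def to_log(rows):
    """Convert a matrix of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in rows]

# Hypothetical 2-state models over a 2-symbol codebook (not from the slides).
MODELS = {
    "word 1": ([math.log(0.9), math.log(0.1)],
               to_log([[0.7, 0.3], [0.2, 0.8]]),
               to_log([[0.9, 0.1], [0.2, 0.8]])),
    "word 2": ([math.log(0.5), math.log(0.5)],
               to_log([[0.5, 0.5], [0.5, 0.5]]),
               to_log([[0.1, 0.9], [0.1, 0.9]])),
}

def recognize(obs, models):
    """Score every word model independently, then pick the highest score."""
    scores = {word: viterbi_log_score(obs, *params) for word, params in models.items()}
    return max(scores, key=scores.get), scores

best_word, all_scores = recognize([0, 0, 0, 1], MODELS)
```

Because the per-word scores do not depend on each other, the loop over `models` is the part that a parallel HW architecture could evaluate simultaneously, one HMM block per word.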


2.2. Implementing Fault Tolerance in the Parts of the SRS

FT in SW:

- well known by the DSP community;

- in general, high degree of success.

FT in HW:

- frequent overflow occurrence in MAC operations;

- confidence in the large amount of reference data stored in memories (codebooks and probability estimations).


2.2.1. Concurrent Consistency Check (CCC)

Key point: maintain the relative distance between the partial results stored in the respective HMM accumulators (Fig. 2b) in order to select the highest score in the "Logic Decision" step.

CCC: performs a 1-bit right shift on the contents of the HMM accumulators after every MAC operation.
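The effect of the 1-bit shift can be sketched with 8-bit accumulators. This is an illustration under stated assumptions, not the authors' circuit: it assumes operands small enough that each product fits in 8 bits, an adder that keeps the carry so the sum can be shifted before it is stored, and made-up input data (`WORD_1`, `WORD_2`). Note that the shift also weights recent terms more heavily; the slides do not say whether or how this is compensated.

```python
WORD_BITS = 8
MAX_VALUE = (1 << WORD_BITS) - 1  # 255

def mac_plain(acc, a, b):
    """Multiply-accumulate in an 8-bit register; wraps around on overflow."""
    total = acc + a * b
    return total & MAX_VALUE, total > MAX_VALUE

def mac_ccc(acc, a, b):
    """Multiply-accumulate followed by a 1-bit right shift (CCC-style)."""
    total = (acc + a * b) >> 1
    return total & MAX_VALUE, total > MAX_VALUE

def score(pairs, mac):
    """Run a sequence of MAC operations; return (accumulator, overflow count)."""
    acc, overflows = 0, 0
    for a, b in pairs:
        acc, overflowed = mac(acc, a, b)
        overflows += overflowed
    return acc, overflows

# Hypothetical operand pairs: word 1 is the clearly better match.
WORD_1 = [(12, 11), (10, 12), (14, 10), (9, 13), (13, 9)]
WORD_2 = [(5, 6), (6, 5), (4, 7), (7, 4), (5, 6)]

plain_1, plain_2 = score(WORD_1, mac_plain), score(WORD_2, mac_plain)
ccc_1, ccc_2 = score(WORD_1, mac_ccc), score(WORD_2, mac_ccc)
```

With this data the plain accumulator for word 1 wraps around twice and ends below word 2, so the logic decision would pick the wrong word. With the shift, no overflow occurs and word 1 keeps the higher score.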


2.2.2. Transparent BIST

Key point: perform concurrent checking of large amounts of memory space while maintaining the real-time response required of this type of DSP system.

Transparent BIST: minimizes the periodic "down times" required to check the functionality of the local memories associated with each of the HMMs.
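The idea of a transparent memory test, one that leaves the stored reference data intact, can be sketched as below. This is a simplified illustration and not the authors' BIST: the per-cell read/invert/verify/restore sequence, the stuck-at-0 fault model, and the names `Memory` and `transparent_test` are assumptions. Testing one address slice at a time is one way to keep each down time short.

```python
class Memory:
    """Byte-wide memory model with optional stuck-at-0 bits (hypothetical fault model)."""

    def __init__(self, data, stuck_at_zero=None):
        self.cells = [value & 0xFF for value in data]
        self.stuck_at_zero = stuck_at_zero or {}  # address -> mask of bits stuck at 0

    def _faulty(self, addr, value):
        return value & ~self.stuck_at_zero.get(addr, 0) & 0xFF

    def read(self, addr):
        return self._faulty(addr, self.cells[addr])

    def write(self, addr, value):
        self.cells[addr] = self._faulty(addr, value)

def transparent_test(mem, start, length):
    """Test a slice of memory and restore its contents; return faulty addresses."""
    faulty = []
    for addr in range(start, start + length):
        original = mem.read(addr)
        inverted = original ^ 0xFF
        mem.write(addr, inverted)          # exercise every bit in the opposite state
        ok = mem.read(addr) == inverted
        mem.write(addr, original)          # restore the reference data
        ok = ok and mem.read(addr) == original
        if not ok:
            faulty.append(addr)
    return faulty

# Hypothetical usage: a fault-free memory and one with bit 0 of address 2 stuck at 0.
good = Memory([0x12, 0x34, 0x56, 0x78])
bad = Memory([0x12, 0x34, 0x56, 0x78], stuck_at_zero={2: 0x01})
good_result = transparent_test(good, 0, 4)
bad_result = transparent_test(bad, 2, 2)
```

After the test on the fault-free memory the contents are unchanged, so recognition can resume without reloading the codebooks or probability tables.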


3. Experimental Results

Case Study :

Implemented and trained an SRS to recognize 2 words.

Figure (design flows, panels (a), (b), (c)). Block labels: System Description (MatLab) partitioned into HW Part (Pattern Matching & Logic Decision Block) and SW Part (Signal Analysis & Conditioning Block); Reliability Functions: CCC, Transp. BIST; System Description (MatLab) as Fully-SW Implementation.


(a) SRS performance improvement due to the system partitioning according to the proposed HW-SW codesign technique.

Implementation                                                                      System performance (s)   Performance improvement
                                                                                    [time required to
                                                                                    recognize a word]
Traditional SW-based microprocessor implementation                                  0.577                    ---
HW-SW based partitioning approach (original HW implementation without redundancy)   0.00132                  437 times

(b) SRS performance degradation due to the inclusion of the transp. BIST into the HW part. (The concurrent consistency check (CCC) is performed in parallel with the application, thus not resulting in a performance penalty.)

Implementation                                         System performance (s)   Performance degradation
                                                       [time required to
                                                       recognize a word]
Original HW implementation (without redundancy)        0.00132                  ---
HW implementation including the transparent BIST *     0.06132                  46 times

* Standalone runtime for the transparent BIST through the local cache memories of the HMM blocks: 60 ms.

Improvement of the HW implementation including the transparent BIST over the traditional SW-based approach: 9 times.
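The ratios quoted on this slide can be rechecked from the measured times. The sketch below assumes, as the footnote suggests, that the time with BIST is the original HW time plus the 60 ms standalone BIST runtime; the variable names are illustrative only.

```python
T_SW = 0.577        # s, traditional SW-based implementation
T_HW = 0.00132      # s, original HW implementation (no redundancy)
T_BIST = 0.060      # s, standalone transparent BIST runtime (60 ms)

# Assumption: the BIST runtime simply adds to the HW recognition time.
T_HW_BIST = T_HW + T_BIST              # about 0.06132 s

speedup_hw = T_SW / T_HW               # about 437 times
slowdown_bist = T_HW_BIST / T_HW       # about 46 times
speedup_hw_bist = T_SW / T_HW_BIST     # about 9 times
```

So even with the transparent BIST included, the partitioned system remains roughly 9 times faster than the fully-SW implementation.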


(c) Area overhead required by the different implementation forms of the fault-tolerant Pattern Matching & Logic Decision Block.

Implementation                                                          Area (configurable logic blocks, CLBs)   Area overhead (%)
Original HW implementation (without redundancy)                         905                                      ---
HW implementation including the concurrent consistency check (CCC)      2                                        0.22
HW implementation including the transparent BIST                        190                                      20.99
HW implementation including both CCC and transparent BIST               192                                      21.21

For the three fault-tolerant rows, the area column gives the CLBs added to the original 905 CLBs.

- 2 words: worst-case overhead of 20.99%.

- The area needed to implement the transp. BIST is approximately constant (~190 CLBs).

- Increasing the number of words adds local cache memory proportionally to the Pattern Matching & Logic Decision Block (parallel architecture!).

- Conclusion: for increased-vocabulary SRS (4, 8, 16, 32, 64 words), the transp. BIST overhead is approximately 11%, 5.5%, 3%, 1.5%, 0.8%, respectively.
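The trend in the BIST overhead can be approximated with a simple model. The sketch assumes that the whole 905-CLB base area scales linearly with the number of words while the BIST stays at 190 CLBs; this strict linearity is an assumption, and the function name is hypothetical.

```python
BASE_WORDS = 2
BASE_AREA_CLB = 905   # original HW implementation for 2 words
BIST_AREA_CLB = 190   # approximately constant transparent BIST area

def bist_overhead_percent(words):
    """BIST area overhead, assuming the base area grows linearly with vocabulary."""
    base_area = BASE_AREA_CLB * words / BASE_WORDS
    return 100.0 * BIST_AREA_CLB / base_area

overheads = {words: round(bist_overhead_percent(words), 2) for words in (2, 4, 8, 16, 32, 64)}
```

The model gives about 10.5% for 4 words and halves with each doubling of the vocabulary. The figures quoted on the slide are slightly higher, so the linear model should be read only as an approximation of the trend.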


(c) SRS reliability (word recognition confidence) due to the inclusion of the concurrent consistency check (CCC) technique after each MAC operation in the Pattern Matching & Logic Decision Block.

Implementation                                                                      System confidence
                                                                                    [frequency with which words
                                                                                    are recognized correctly]
Traditional SW-based microprocessor implementation                                  approx. 90%
HW-SW based partitioning approach (original HW implementation without redundancy)   approx. 40%
HW implementation including the concurrent consistency check (CCC)                  approx. 90%

System confidence degradation: undesired environment noise present during input evaluation (speech capture) as well as during the training period of the SRS.


4. Final Conclusions

We have presented a new approach that:

- considers the specific characteristics of DSP algorithms (SRS) to couple HW/SW partitioning and redundancy techniques;

- adds reliability while improving real-time response.

The proposed techniques:

- Concurrent Consistency Check (CCC): makes it possible to implement part of the SRS in HW by reducing the number of overflows in arithmetic operations while minimizing area overhead (based on 8-bit words).

- Transparent BIST: checks for faults affecting the reference data stored in the large local memories associated with the HMMs while preserving the real-time response needs of SRSs.


4. Future Work

Implement SRSs with: larger vocabulary (at least 32 words).

Improve confidence by enlarging from 66 to 132 observation sequences per word.

Propose a new on-line technique to minimize the noise affecting speech.