Post on 21-Dec-2015
SCORPION : A New Approach to Design Reliable
Real-Time Speech Recognition Systems
F. Vargas, R. D. Fagundes, D. Barros Jr.
Catholic University – PUCRS
Electrical Engineering Dept. Av. Ipiranga, 6681
90619-900 Porto Alegre
Brazil
Summary
1. Preliminary Considerations on the General Structure of Speech Recognition Systems (SRS)
2. SCORPION: The Proposed Approach
2.1. Partitioning the SRS in HW and SW
2.2. Implementing Fault-Tolerance in the Parts of the SRS
2.2.1. Concurrent Consistency Check (CCC)
2.2.2. Transparent BIST
3. Experimental Results
4. Final Considerations & Future Work
1. Preliminary Considerations on the General Structure of Speech Recognition Systems (SRS)
- DSP components and systems dedicated to speech recognition are increasingly used in today's applications, ranging from cellular phones to voice-oriented bank transactions and security systems.
- Increased demand for real-time response and reliable operation.
Fig. 1. General block diagram of speech recognition systems. Block labels: Speech, Signal Analysis, Feature Extraction, Vector Quantization (VQ) Codebook, Pattern Matching, Reference Patterns (Hidden Markov Models, HMM), Logic Decision, Words.
Fig. 2. (a) Signal Analysis & Conditioning Block,
(b) Pattern Matching & Logic Decision Block,
(c) implementation to recognize 3 words.
Panel labels:
(a) Speech Sound, Low Pass Filter, Sampling, Pre-Emphasis Filter, Windowing, LPC/Cepstral Analysis, VQ with Vector Codebook, Observation Sequences (to the Pattern Matching Block).
(b), (c) Observation Sequences (from the Signal Analysis Block); Local Cache Memory 1 with HMM (word 1), Local Cache Memory 2 with HMM (word 2), Local Cache Memory 3 with HMM (word 3); Logic Decision; word recognition.
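Two of the front-end steps named in Fig. 2(a), the pre-emphasis filter and windowing, can be sketched as follows. This is a minimal illustration only: the slides do not give the filter coefficient or the window type, so the first-order filter with `alpha = 0.95` and the Hamming window are assumptions, and the function names are hypothetical.

```python
import math


def pre_emphasis(samples, alpha=0.95):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha is an assumed value; the first sample passes through unchanged.
    """
    out = [samples[0]] if samples else []
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def hamming_window(length):
    """Hamming window coefficients (assumed window type)."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def window_frame(frame):
    """Multiply one frame of samples by the window, element by element."""
    return [s * w for s, w in zip(frame, hamming_window(len(frame)))]


if __name__ == "__main__":
    frame = pre_emphasis([0.0, 0.5, 1.0, 0.5, 0.0])
    print(window_frame(frame))
```

The windowed frames would then go to the LPC/cepstral analysis and VQ stages, which are not sketched here.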
2. SCORPION: The Proposed Approach
2.1. Partitioning the SRS in HW and SW
Specific SRS Dataflow:
- Low complexity: high volume of parallel additions followed by bitwise XOR operations to perform pattern matching and logic decision.
- High complexity: high-volume sequence of digital filtering operations to adjust frequency variations during the signal analysis and conditioning procedure.
Pattern Matching & Logic Decision Block: almost no data dependency -> HW Part
Signal Analysis & Conditioning Block: data are strongly dependent on one another -> SW Part
Fig. 3. SRS main blocks, after partitioning into SW and HW parts: Speech Sound -> Signal Analysis & Conditioning Block (SW Part) -> Observation Sequences -> Pattern Matching & Logic Decision Block (HW Part) -> word recognition.
2.2. Implementing Fault Tolerance in the Parts of the SRS
FT in SW:
- well known by the DSP community;
- in general, high degree of success.
FT in HW:
- frequent overflow occurrence in MAC operations;
- confidence in the large amount of reference data stored in memories (codebooks and probability estimates).
2.2.1. Concurrent Consistency Check (CCC)
Key-point: maintain the relative distance between the partial results stored in the respective HMM accumulators (Fig. 2b) in order to select the highest score in the “Logic Decision Step”.
CCC: performs a 1-bit right shift of the contents of the HMM accumulators after every MAC operation.
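The effect of the shift can be illustrated with a small software model. This is a sketch under stated assumptions, not the authors' circuit: accumulators and products are taken to be 8 bits wide, the adder is assumed to keep its carry-out so the right shift folds it back into the 8-bit result, and the product values and function names are made up for the example.

```python
WIDTH = 8                  # assumed accumulator width in bits
MASK = (1 << WIDTH) - 1    # 0xFF


def mac_plain(acc, product):
    """Plain accumulate: the carry-out is lost, so the sum wraps around."""
    return (acc + product) & MASK


def mac_with_ccc(acc, product):
    """Accumulate, then shift right by one bit.

    With acc and product both below 2**WIDTH, the shifted sum always
    fits in WIDTH bits, so the accumulator cannot overflow.
    """
    return ((acc + product) >> 1) & MASK


def score(products, step):
    """Run one HMM accumulator over a sequence of already-computed products."""
    acc = 0
    for p in products:
        acc = step(acc, p)
    return acc


# Hypothetical per-step products for two word models.
word_products = {"word 1": [200, 100], "word 2": [100, 90]}

plain = {w: score(p, mac_plain) for w, p in word_products.items()}
ccc = {w: score(p, mac_with_ccc) for w, p in word_products.items()}

if __name__ == "__main__":
    print("plain:", plain, "->", max(plain, key=plain.get))
    print("CCC:  ", ccc, "->", max(ccc, key=ccc.get))
```

In this example the plain accumulator for word 1 wraps around and the logic decision would pick word 2, while the shifted accumulators keep word 1 ahead. Because every accumulator is shifted at the same step, the scores stay comparable with one another, although earlier contributions are progressively scaled down.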
2.2.2. Transparent BIST
Key-point: perform concurrent checking of large amounts of memory space while maintaining the real-time response required of this type of DSP system.
Transparent BIST: minimizes the periodic “down times” required to check the functionality of the local memories associated with each of the HMMs.
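A transparent memory test is one that leaves the stored contents unchanged once it completes, so the reference data do not have to be reloaded afterwards. The sketch below shows the general idea on a simulated memory; it is a generic read/invert/restore pass with a simple fault model, not the test algorithm used in the slides, and the class and function names are hypothetical.

```python
class SimulatedMemory:
    """Word-addressable memory with an optional stuck-at fault (illustrative)."""

    def __init__(self, data, stuck_at=None, width=8):
        self.cells = list(data)
        self.mask = (1 << width) - 1
        self.stuck_at = stuck_at  # (address, bit, value) or None

    def write(self, addr, value):
        self.cells[addr] = value & self.mask

    def read(self, addr):
        value = self.cells[addr]
        if self.stuck_at and self.stuck_at[0] == addr:
            _, bit, stuck = self.stuck_at
            if stuck:
                value |= 1 << bit       # bit always reads as 1
            else:
                value &= ~(1 << bit)    # bit always reads as 0
        return value


def transparent_test(mem, size):
    """Return the first failing address, or None if every cell passes.

    Each cell is read, overwritten with its complement, checked,
    and then restored to its original value.
    """
    for addr in range(size):
        original = mem.read(addr)
        inverted = ~original & mem.mask
        mem.write(addr, inverted)
        ok = mem.read(addr) == inverted
        mem.write(addr, original)       # restore the stored contents
        if not ok or mem.read(addr) != original:
            return addr
    return None


if __name__ == "__main__":
    data = [0x12, 0x34, 0x56, 0x78]
    print(transparent_test(SimulatedMemory(data), len(data)))
    print(transparent_test(SimulatedMemory(data, stuck_at=(2, 0, 1)), len(data)))
```

Because the test touches one cell at a time and restores it immediately, it could in principle be interleaved with normal operation in short slices, which is the property the key-point above relies on.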
3. Experimental Results
Case Study :
Implemented and trained an SRS to recognize 2 words.
[Figure: implementation flows derived from the system description (MatLab). Partitioned implementation: HW Part (Pattern Matching & Logic Decision Block), SW Part (Signal Analysis & Conditioning Block), and reliability functions (CCC, Transp. BIST). Reference: fully-SW implementation. Panels (a), (b), (c).]
(a) SRS performance improvement due to the system partitioning according to the proposed HW-SW codesign technique.
System | Performance (s) [time required to recognize a word] | Performance Improvement
Traditional SW-based microprocessor implementation | 0.577 | ---
HW-SW based partitioning approach (original HW implementation without redundancy) | 0.00132 | 437 times
(b) SRS performance degradation due to the inclusion of the transparent BIST into the HW part. (The concurrent consistency check (CCC) is performed in parallel with the application and therefore causes no performance penalty.)
System | Performance (s) [time required to recognize a word] | Performance Degradation
Original HW implementation (without redundancy) | 0.00132 | ---
HW implementation including the transparent BIST * | 0.06132 | 46 times
* Standalone runtime for the transparent BIST through the local cache memories of the HMM blocks: 60 ms.
Improvement over the traditional SW-based approach with the transparent BIST included: approx. 9 times.
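The ratios quoted for (a) and (b) follow directly from the measured times. A short check, using only the figures given on the slide (the variable names are illustrative):

```python
sw_time = 0.577        # s, traditional SW-based implementation
hw_time = 0.00132      # s, HW part without redundancy
bist_time = 0.060      # s, standalone transparent BIST runtime (60 ms)

hw_bist_time = hw_time + bist_time      # HW part including the transparent BIST

speedup = sw_time / hw_time             # HW-SW partitioning vs. SW only
slowdown = hw_bist_time / hw_time       # cost of adding the transparent BIST
net_speedup = sw_time / hw_bist_time    # HW with BIST vs. SW only

if __name__ == "__main__":
    print(f"HW with BIST: {hw_bist_time:.5f} s")
    print(f"speedup: {speedup:.1f}x, slowdown: {slowdown:.1f}x, "
          f"net speedup: {net_speedup:.1f}x")
```

Truncated to whole numbers these give the 437 times, 46 times and roughly 9 times quoted on the slide.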
(c) Area overhead required by the different implementation forms of the fault-tolerant Pattern Matching & Logic Decision Block.
Implementation | Area (configurable logic blocks, CLBs) | Area Overhead (%)
Original HW implementation (without redundancy) | 905 | ---
HW implementation including the concurrent consistency check (CCC) | 2 | 0.22
HW implementation including the transparent BIST | 190 | 20.99
HW implementation including both CCC and transparent BIST | 192 | 21.21
(For the rows with redundancy, the area column lists the CLBs added to the original 905.)
- 2 words: worst-case overhead of 20.99% for the transparent BIST.
- The area needed to implement the transparent BIST is approximately constant (~190 CLBs).
- Increasing the number of words adds local cache memory to the Pattern Matching & Logic Decision Block proportionally (parallel architecture!).
- Conclusion: for an increased-vocabulary SRS (4, 8, 16, 32, 64 words), the transparent BIST overhead falls to approx. 11%, 5.5%, 3%, 1.5%, 0.8%, respectively.
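The overhead percentages for the 2-word case are the added CLBs divided by the 905 CLBs of the original block. A minimal check of that arithmetic, using only the table values (the names are illustrative):

```python
BASE_CLBS = 905  # original HW implementation, 2-word SRS

ADDED_CLBS = {
    "CCC": 2,
    "transparent BIST": 190,
    "CCC + transparent BIST": 192,
}


def overhead_percent(added_clbs, base_clbs=BASE_CLBS):
    """Area overhead as a percentage of the original block."""
    return 100.0 * added_clbs / base_clbs


if __name__ == "__main__":
    for name, added in ADDED_CLBS.items():
        print(f"{name}: {overhead_percent(added):.2f}%")
```

Rounded to two decimals, the combined case comes out as 21.22%; the 21.21% on the slide equals the sum of the two rounded single-technique entries (0.22 + 20.99).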
(d) SRS reliability (word recognition confidence) due to the inclusion of the concurrent consistency check (CCC) technique after each MAC operation in the Pattern Matching & Logic Decision Block.
System | System confidence [frequency with which words are recognized correctly]
Traditional SW-based microprocessor implementation | approx. 90%
HW-SW based partitioning approach (original HW implementation without redundancy) | approx. 40%
HW implementation including the concurrent consistency check (CCC) | approx. 90%
System confidence degradation: due to undesired environment noise present during input evaluation (speech capture) as well as during the training period of the SRS.
4. Final Conclusions
We have presented a new approach that:
- considers the specific characteristics of DSP algorithms (SRS) to couple HW/SW partitioning and redundancy techniques;
- adds reliability while improving real-time performance.
The proposed techniques:
- Concurrent Consistency Check (CCC): allows part of the SRS to be implemented in HW by reducing the number of overflows in arithmetic operations while minimizing area overhead (based on 8-bit words).
- Transparent BIST: checks for faults affecting the reference data stored in the large local memories associated with the HMMs while preserving the real-time response needs of SRSs.
4. Future Work
- Implement SRSs with a larger vocabulary (at least 32 words).
- Improve confidence by enlarging from 66 to 132 observation sequences per word.
- Propose a new on-line technique to minimize the noise affecting speech.