
Advances in WP2

Nancy Meeting – 6-7 July 2006

www.loquendo.com


Recent Work on NN Adaptation in WP2

• State-of-the-art LIN adaptation method implemented and evaluated on the benchmarks (m12)

• Innovative LHN adaptation method implemented and evaluated on the benchmarks (m21)

• Experimental results on the benchmark corpora and the Hiwire database with LIN and LHN (m21)

• Further advances on new adaptation methods (m24)

LIN Adaptation

[Figure: the speaker-independent MLP (SI-MLP) maps the speech-signal parameters through the input layer, a 1st and 2nd hidden layer, and the output layer to the emission probabilities of the acoustic-phonetic units. A Linear Input Network (LIN) is inserted between the speech-signal parameters and the input layer of the SI-MLP.]
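The structure above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project code: the layer sizes, the function names `si_mlp` and `lin_mlp`, and the sigmoid/softmax choices are assumptions made for the sketch. The key point it shows is that the LIN is a linear transform (A, c) applied to the input features, initialized to the identity, while the SI-MLP weights stay frozen.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def si_mlp(x, layers):
    """Frozen speaker-independent MLP: sigmoid hidden layers, softmax output.
    layers is a list of (W, b) pairs; the last pair is the output layer."""
    h = x
    for W, b in layers[:-1]:
        h = sigmoid(h @ W + b)
    W, b = layers[-1]
    z = h @ W + b
    e = np.exp(z - z.max())
    return e / e.sum()          # emission probabilities of the phonetic units

def lin_mlp(x, A, c, layers):
    """LIN adaptation: a trainable linear transform (A, c) applied to the
    input features before the frozen SI-MLP. During adaptation only A and c
    are updated by backpropagation; the SI-MLP weights are untouched."""
    return si_mlp(x @ A + c, layers)
```

With A initialized to the identity and c to zero, the adapted network starts out exactly equal to the unadapted SI-MLP, which is the usual starting point for LIN training.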

LHN Adaptation

[Figure: the same SI-MLP (input layer, 1st and 2nd hidden layers, output layer producing the emission probabilities of the acoustic-phonetic units), but with a Linear Hidden Network (LHN) inserted after a hidden layer instead of before the input layer.]
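A corresponding sketch for LHN, again with hypothetical layer sizes and placeholder (all-zero) frozen weights chosen only for illustration: the linear transform (A, c) now sits after the 2nd hidden layer rather than at the input, is initialized to the identity, and is the only set of parameters updated during adaptation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder frozen SI-MLP weights (hypothetical sizes, for illustration only)
W1, b1 = np.zeros((4, 5)), np.zeros(5)   # 1st hidden layer (frozen)
W2, b2 = np.zeros((5, 6)), np.zeros(6)   # 2nd hidden layer (frozen)
W3, b3 = np.zeros((6, 3)), np.zeros(3)   # output layer (frozen)

# LHN: linear transform after the 2nd hidden layer, initialized to identity
A, c = np.eye(6), np.zeros(6)            # the only adapted parameters

def lhn_forward(x):
    h1 = sigmoid(x @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    h2 = h2 @ A + c                      # LHN insertion point
    z = h2 @ W3 + b3
    e = np.exp(z - z.max())
    return e / e.sum()                   # emission probabilities (posteriors)
```

As with LIN, the identity initialization means the network behaves exactly like the unadapted SI-MLP before any adaptation step is taken.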

Results Summary (% W.E.R.)

Test set                        Baseline   LIN adapted   E.R.     LHN adapted   E.R.
WSJ0 16kHz bigram LM            10.5       9.4           10.5%    8.4           20.0%
WSJ1 Spoke-3 16kHz bigram LM    54.2       46.5          14.2%    30.6          43.5%
HIWIRE 8kHz                     11.6       7.8           32.1%    7.2           37.9%

(E.R. = relative error-rate reduction with respect to the baseline)
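The E.R. columns are relative reductions of the word error rate with respect to the baseline. A one-line helper (the function name is ours, not from the project) reproduces them, e.g. for the WSJ0 row:

```python
def rel_error_reduction(baseline_wer, adapted_wer):
    """Relative error-rate reduction in percent: (baseline - adapted) / baseline."""
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer

# WSJ0: 10.5 -> 9.4 with LIN, 10.5 -> 8.4 with LHN
print(round(rel_error_reduction(10.5, 9.4), 1))  # 10.5
print(round(rel_error_reduction(10.5, 8.4), 1))  # 20.0
```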


Papers presented:

• Roberto Gemello, Franco Mana, Stefano Scanzio, Pietro Laface, Renato De Mori, “Adaptation of Hybrid ANN/HMM models using hidden linear transformations and conservative training”, Proc. of ICASSP 2006, Toulouse, France, May 2006

• Dario Albesano, Roberto Gemello, Pietro Laface, Franco Mana, Stefano Scanzio, “Adaptation of Artificial Neural Networks Avoiding Catastrophic Forgetting”, Proc. of IJCNN 2006, Vancouver, Canada, July 2006


The “Forgetting” problem in ANN Adaptation

• It is well known in connectionist learning that acquiring new information during adaptation can damage previously learned information (catastrophic forgetting)

• This effect must be taken into account when adapting an ANN with a limited amount of data that does not include enough samples for all the classes.

• The “absent” classes may be forgotten during adaptation, since discriminative training (error backpropagation) always assigns zero targets to absent classes


“Forgetting” in ANN for ASR

• When adapting an ASR ANN/HMM model, this problem can arise if the adaptation set contains no examples for some phonemes, due to the limited amount of adaptation data or the limited vocabulary

• ANN training is discriminative, unlike that of GMM-HMMs, so absent phonemes are penalized by being assigned a zero target during adaptation

• This makes the ANN forget its ability to classify the absent phonemes. Thus, while HMM models for phonemes with no observations simply remain unadapted, the ANN output units corresponding to those phonemes lose their characterization rather than staying unadapted

Example of Forgetting

• Adaptation examples only for E, U, O (e.g. from the words: uno, due, tre); no examples for the other vowels (A, I, ə)

• The classes with examples adapt themselves, but tend to invade the classes with no examples, which are partially “forgotten”

[Figure: two F1–F2 formant charts (kHz) of the vowel classes I, E, e, A, U, O, before and after adaptation, illustrating how the adapted classes E, U, O expand into the regions of the absent vowels.]

“Conservative” Training

• We have introduced “conservative training” to avoid forgetting the absent phonemes

• The idea is to avoid a zero target for the absent phonemes, using for them the output of the original NN as target

Let F_P be the set of phonemes present in the adaptation set and F_A the set of absent ones. The targets are assigned according to the following equations:

Standard policy:
  TARGET(f_i) = 0    if f_i ∈ F_A
  TARGET(f_i) = 1    if f_i ∈ F_P and f_i = correct phoneme
  TARGET(f_i) = 0    if f_i ∈ F_P and f_i ≠ correct phoneme

Conservative policy:
  TARGET(f_i) = OUTPUT_originalNN(f_i)                       if f_i ∈ F_A
  TARGET(f_i) = 1 − Σ_{f_j ∈ F_A} OUTPUT_originalNN(f_j)     if f_i ∈ F_P and f_i = correct phoneme
  TARGET(f_i) = 0                                            if f_i ∈ F_P and f_i ≠ correct phoneme
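The conservative policy above can be written as a short NumPy function. This is a sketch under our own naming (the function and argument names are not from the project code); it assigns the original network's posteriors to the absent classes and the remaining probability mass to the correct class, so the targets still sum to one.

```python
import numpy as np

def conservative_targets(original_posteriors, correct_idx, present_mask):
    """Conservative Training target assignment for one frame.

    original_posteriors: outputs of the original (unadapted) NN, one per class
    correct_idx: index of the correct phoneme class (must be in F_P)
    present_mask: True for classes in F_P (present in the adaptation set),
                  False for classes in F_A (absent)
    """
    p = np.asarray(original_posteriors, dtype=float)
    present = np.asarray(present_mask, dtype=bool)
    absent = ~present
    targets = np.zeros_like(p)
    # Absent classes keep the original network's response as target
    targets[absent] = p[absent]
    # The correct class receives the remaining probability mass
    targets[correct_idx] = 1.0 - p[absent].sum()
    return targets
```

On the example of the next slide (original posteriors 0.03, 0.00, 0.95, 0.00, 0.02 for classes A1, P1, P2, P3, A2, with P2 correct), the correct class gets 1 − (0.03 + 0.02) = 0.95 and the absent classes keep 0.03 and 0.02.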

Conservative Training target assignment policy

P2 is the class corresponding to the correct phoneme; Px: classes present in the adaptation set; Ax: absent classes.

Class:                                              A1     P1     P2     P3     A2
Posterior probability (original network):           0.03   0.00   0.95   0.00   0.02
Standard target assignment policy:                  0.00   0.00   1.00   0.00   0.00

“Conservative” Training

• In this way, the phonemes that are absent from the adaptation set are “represented” by the response of the original NN

• Thus, the absent phonemes are not “absorbed” by the neighboring present phonemes

• The results of adaptation with conservative training are:
  – Comparable performance on the target environment
  – Preserved performance on the generic environment
  – A large improvement in speaker adaptation, when only a few sentences are available

Adaptation tasks

– Application data adaptation: Directory Assistance
  • 9325 Italian city names
  • 53713 training + 3917 test utterances

– Vocabulary adaptation: Command words
  • 30 command words
  • 6189 training + 3094 test utterances

– Channel-Environment adaptation: Aurora-3
  • 2951 training + 654 test utterances

Adaptation Results on different tasks (% WER)

Adaptation Method   Application:           Vocabulary:      Channel-Environment:
                    Directory Assistance   Command Words    Aurora-3 CH1
No adaptation       14.6                   3.8              24.0
LIN                 11.2                   3.4              11.0
LIN + CT            12.4                   3.4              15.3
LHN                 9.6                    2.1              9.8
LHN + CT            10.1                   2.3              10.4

Mitigation of Catastrophic Forgetting using Conservative Training

Tests using adapted models on Italian continuous speech (% WER)

                    Models adapted on:
Adaptation Method   Directory Assistance   Command Words    Aurora-3 CH1
LIN                 36.3                   42.7             108.6
LIN + CT            36.5                   35.2             42.1
LHN                 40.6                   63.7             152.1
LHN + CT            40.7                   45.3             44.2
No Adaptation       29.3 (unadapted baseline)


Conclusions

– The new LHN adaptation method, developed within the project, outperforms standard LIN adaptation

– In adaptation tasks with missing classes, Conservative Training reduces the catastrophic-forgetting effect, preserving performance on the generic task


Workplan

• Selection of suitable benchmark databases (m6)

• Baseline set-up for the selected databases (m8)

• LIN adaptation method implemented and evaluated on the benchmarks (m12)

• Experimental results on Hiwire database with LIN (m18)

• Innovative NN adaptation methods and algorithms for acoustic modeling and experimental results (m21)

• Further advances on new adaptation methods (m24)

• Unsupervised Adaptation: algorithms and experimentation (m33)