speech quality assessment for wideband communication...
TRANSCRIPT
Speech Quality Assessment for Wideband Communication Scenarios
H. W. Gierlich, S. Völl, F. Kettler(HEAD acoustics GmbH)
P. Jax(IND, RWTH Aachen)
� Workshop on Wideband Speech Quality in Terminals and Networks
supported by:
Outline
� General aspects of speech quality in wideband systems
� Subjective evaluationsConversational testsSpeech intelligibilityBackground noise transmissionEcho tests
� Summary
speechquality
talkingsituation
listeningsituation
conversationalsituation
Speech Quality Parameters
� from the user�s perspective
Auditory Parameters�contributing to speech quality:
(Speech) sound quality
Quality of background noise transmission
Delay and echo
Double talk capability
Switching and echosingle talk/double talk
Loudness
(System) noise
Narrow band Wide bandDifferent quality
perception
Loudness (WB)
Different quality Perception ?
Different quality Perception ?
Double talk capability
Different quality Perception ?
Different quality Perception
Roadmap for the development of objective measurements
1. Conversational tests ! parameter identification (qualitative)
2. Listening-only tests ! quantitative judgement 3. Development of objective measurement
methods to reproduce the results of the LOT
! Quality evaluation of wideband systems without subjective tests
Outline
� General aspects of speech quality in wideband systems
� Subjective evaluationsConversational testsSpeech intelligibilityBackground noise transmissionEcho tests
� Summary
Conversational Tests
� Purpose: identification of parameters characterizing the communicational quality in wideband systems
� Test conditions:Experts tests�Kandisky�-test4 wideband codecs under test3 conditions for each codec: �normal� conversation, with music in office room, with babble in office room�free answering�
Setup and test procedure
� Setup:
� Test procedure:Different codecs includedOne echo-canceller for all tests
�normal office�
PC for
BGN
HFT 1
EC&CC
G.722
Sound-proof cabinetHFT 2
EC off
G.722
CodecA, B, C
Results
� Sorting the comments in categories:Speech sound quality, echo behavior, Quality of background noise, others (e.g. noise, clipping)
� Example: speech sound quality
rattles, crackles, blunt sound, ...BWE
high dynamic, hollow, clank, ...AMR-WB2
sounds rough, naturally, distorted, ...AMR-WB
sounds naturally, high dynamic ...G.722commentscodec
Conclusion
� Relevant parameters to be studied further:sound of speechEcho: level, masking, intelligibilityquality of background noise transmissionNoiseDouble talkSwitching/clipping
� Design of listening-only tests concerningspeech intelligibility: narrow band vs. wide bandquality of transmitted background noise annoyance of echo
Outline
� General aspects of speech quality in wideband systems
� Subjective evaluationsConversational testsSpeech intelligibilityBackground noise transmissionEcho tests
� Summary
Speech intelligibility test
� Sensitive test: logatom-test � Consonant � vowel � consonant� Informal test:
3x 12 test persons, 29 logatoms
� Test persons note the �word� they understood
Recording & Listening
� Recording:
� Listening:test persons listen to the artificial head recordings
�normal office� Sound-proof cabinetPCM /
AMR-WB /
ISDN
PC
0
20
40
60
80
100
ISDN PCM 16 bit AMR 12,65kbit/s
codec
% c
orre
ct lo
gato
ms
Results
� Increased intelligibilityfor wideband codecs
14 % ≅ 4 logatoms
Outline
� General aspects of speech quality in wideband systems
� Subjective evaluationsConversational testsSpeech intelligibilityBackground noise transmissionEcho tests
� Summary
Background noise assessment: music
� Background noise: additional informationabout talkers environment
� Tests with untrained persons: assessment on a 5 point MOS scaleexperts: assessment on a 5 point MOS scale and giving reasons why
� 16 different codecs under test
Background noise tests
� Recording listening samples:
� Listening:test persons listen to artificial head recordings
� ACR scale: excellent � good � fair � poor � bad
�normal office�
PC for
BGN
HFT 1sound-proof cabinet
HFT 2codecor filter
PC
Quality of transmitted background music
1
2
3
4
5
PCM
G722
MP3G.72
2.1G.72
2-HP
ISDN
G.722-
NBBW
E-U
AMR8G.72
2-TP
BWE-O
BWE-U
+O
AMR-2
IND0
AMR2CN
AMR-0
Codec
MO
S-LQ
S
Results
� 3 quality levels with significantly different MOS - values
- wideband- good �intelligibility�
of music
- narrowband- good �intelligibility�
of music
- mostly wideband- bad �intelligibility�
of music
Outline
� General aspects of speech quality in wideband systems
� Subjective evaluationsConversational testsSpeech intelligibilityBackground noise transmissionEcho tests
� Summary
Echo annoyance test
� Using hands-free telephones ! echo disturbances a dominant problem
� Investigation of the annoying aspects of echo using wide-band links:
influence of echo sound, influence of echo level,influence of codec, ...
� Mean one-way transmission time constantfor all listening samples: 170 ms
Echo annoyance tests
� Recording:
� Listening:test persons listen to the artificial head recordings: ! direct speech + echo
�normal office�
HFT 1
EC offCC off
sound-proof cabinet
HFT 2
EC on CC on
codecor filter
PCecho
Tests & assessment
� Tests with untrained persons: assessment on a 5 point MOS scaleexperts: assessment on a 5 point MOS scaleand giving reasons why
� DCR scale:5 � echo is inaudible4 � echo is audible, but not annoying3 � echo is slightly annoying2 � echo is annoying1 � echo is very annoying
Echo levels
� TCLw acc. ITU-T P.79
� Note: hands-free on boths sides, SLR = 7dB, RLR = 5dB (including HFT correction of 14 dB) => TELR(max) = 39dB
! Investigation of codec and echo level
TCLw = 27 dB ! �low� echo level21 dB ! �medium� echo level13 dB ! �high� echo level
�normal office�HFT 1
EC offCC off
echoRLR
Results
� Comparision of annoyance by echo level (experts)
!Differences for echos with the same echo level!Echo masked by direct speech
Echo annoyance
1
2
3
4
5
PCM G.722.1 MP3 G722 AMR, 12kbit/s
AMR WB2, 12kbit/s
BWE ISDN
codec
MO
S-LQ
S
high echo levelmedium echo levellow echo level
Influence of the bandwidth� Filtering of the echo signal
� Speech modulated noise (two examples)� Level adjustment to TCLw = ! �medium� echo
level
Filter
-30
-20
-10
0
10
20
30
10 100 1000 10000
widebandwideband f>300 Hzwideband f<3400narrow bandnarrow band lv. Adjlow freq. increasedlow freq. decreasedhigh freq. increasedhigh freq. strongly increased
-> f/Hz
dB
echo annoyance: dependency on bandwidth
1
2
3
4
5
wide-band
wide-band
f > 30
0 Hz
wide-band
f < 34
00 H
zna
rrow-ba
nd
narro
w-band
lv. a
dj.
low fre
q. incre
ased
low fre
q. dec
rease
d
high fre
q. incre
ased
high fre
q. stro
ngly
increa
sed
echo
= mod
ulated w
ith nois
eec
ho =
noise
Variante
MO
S-LQ
S
traineduntrained
Results
significant differences between experts and untrained test persons
Results echo annoyance
� High frequencies !!!! very annoying� Wide-band and low freq. !!!! slightly annoying� Experts more critical than untrained test
persons� Speech modulated noisy echo:
Experts: very annoying ! no advantagesUntrained: felt insecure
Summary (1)
� Background noise transmission - relevant aspects:Bandwidth�intelligibility� / brightness / low distortion (small difference to theoriginal)
� Echo annoyance - relevant aspects:LevelMasking propertiesDistortion and frequency characteristics
� Additional parameters to be investigated subjectively:
NoiseSwitching/clipping Double talk behavior