© Fraunhofer IIS 1 08.09.2017
Methods for Low Bitrate Coding Enhancement Part II: Spatial Enhancement
Christian Uhle1,2, Patrick Gampp1, Oliver Hellmuth1, Peter Prokein1,
Jürgen Herre2,1, Sascha Disch1,2, Julia Havenstein1, Antonios Karampourniotis1
1 Fraunhofer IIS, Erlangen, Germany
2 International Audio Laboratories Erlangen, Germany
1. Introduction
2. System Overview
3. Ambient Sound Enhancement
4. Stereo Width Enhancement
5. Evaluation
6. Conclusion
1. Introduction: Motivation
Perceptual Audio Coding (PAC) is applied for storage and transmission of audio signals.
Perceptual transparency is achieved when the bitrate is high enough.
Original and coded/decoded signals are indistinguishable when listening in an optimal listening environment.
At low bitrates, artifacts can be introduced and the sound quality is reduced.
The width of the stereo image is reduced, e.g. due to
• a decreased difference signal (M/S coding),
• an increased correlation between the channel signals (intensity stereo coding).
The aim is to apply post-processing for improving the sound quality.
The processing is single-ended, i.e. it has no information about the coding (codec, bitrate).
The criterion is pleasantness, not transparency.
2. System Overview: Integration into the Automotive Sound System
[Figure: two block diagrams of the automotive signal chain, titled "Operation in the Head Unit" and "Operation in the Amplifier". Blocks: Audio Decoder, Audio Source Manager, Car Head Unit, Car Amplifier, Car Sound Processing, Low Bitrate Coding Enhancement Suite (Spectral Restoration, Spatial Enhancement, ...). Signals: (compressed) audio bitstream, (degraded) PCM audio signal, (enhanced) PCM audio signal, PCM loudspeaker signal.]
3. Ambient Sound Enhancement: Overview
Improve the perceived stereo image by applying artificial decorrelation to the background signal components.
Background sounds: ambient sounds, background music (radio broadcast) and musical accompaniment.
Foreground sounds: singers, talkers, soloists, loud instruments (drums).
Maintains the timbral qualities without introducing coloration and artifacts.
Decorrelation can impair the sound quality when applied to foreground sounds (e.g. speech, drums).
Decorrelation is not required for foreground sounds (directional sounds are locatable).
The intensity of the decorrelation is controlled using a model of reverberance (perceptual attribute that relates to the intensity of reverberation).
3. Ambient Sound Enhancement: Block Diagram
Background sounds are separated by attenuating transient and tonal signals.
3. Ambient Sound Enhancement: Separation of the Background Sounds
1. STFT.
2. Spectral weighting, i.e. scaling of the spectral coefficients:
• spectral weights (for each time-frequency bin) to attenuate transient signal components,
• spectral weights for attenuating tonal signal components,
• combination of these spectral weights (by taking the minimum of both).
3. Inverse STFT.
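The processing chain above can be sketched in a few lines of Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the square-root Hann window, the 50 % overlap, the hop size and the function names `stft`, `istft` and `separate_background` are illustrative choices, and the two weight arrays are taken as given.

```python
import numpy as np


def stft(x, hop=256):
    """STFT with a square-root Hann window and 50 % overlap (assumed setup)."""
    n_fft = 2 * hop
    win = np.sqrt(np.hanning(n_fft + 1)[:-1])      # periodic Hann, square root
    n_frames = int(np.ceil(len(x) / hop)) + 1
    padded = np.zeros((n_frames + 1) * hop)
    padded[hop:hop + len(x)] = x                   # one hop of leading zeros
    frames = np.stack([padded[i * hop:i * hop + n_fft] * win
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)             # shape: (frames, bins)


def istft(X, length, hop=256):
    """Inverse of `stft` by windowed overlap-add."""
    n_fft = 2 * hop
    win = np.sqrt(np.hanning(n_fft + 1)[:-1])
    frames = np.fft.irfft(X, n=n_fft, axis=1) * win
    out = np.zeros((X.shape[0] + 1) * hop)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
    return out[hop:hop + length]                   # drop the padding


def separate_background(x, transient_weights, tonal_weights, hop=256):
    """Scale each time-frequency bin by the smaller of the two weights."""
    X = stft(x, hop)
    weights = np.minimum(transient_weights, tonal_weights)
    return istft(weights * X, len(x), hop)
```

Taking the minimum means that a bin is attenuated as soon as either criterion (transient or tonal) asks for it. With all weights equal to one, the sketch returns the input unchanged.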
3. Ambient Sound Enhancement: Attenuation of Transient Signals
Signal model: the input signal is an additive mixture of a transient signal component and a sustained signal component (in the STFT domain, with time frame index k and frequency bin index m):
$|X(k,m)| = |X_t(k,m)| + |X_s(k,m)|$
The transient signal is attenuated by spectral weighting:
$|Y_{trns}(k,m)| = G_{trns}(k,m)\,|X(k,m)|$
The spectral weights are computed from estimates of the sustained signal and the transient signal:
$G_{trns} = \frac{|\hat{X}_s| + |\hat{X}_t|}{|X|}$
The sustained signal magnitude is estimated by means of low-pass filtering of the sub-band magnitudes along time and limiting the sustained signal by the input.
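The estimation of the sustained magnitude (low-pass filtering along time, limited by the input) can be illustrated as follows. This is a hedged sketch: the one-pole smoother, its coefficient `smoothing`, and the function names are assumptions made for illustration. The weighting rule used here, sustained estimate divided by input magnitude, is also an illustrative simplification and not necessarily the rule used by the authors.

```python
import numpy as np


def estimate_sustained(mag, smoothing=0.9):
    """Low-pass filter the sub-band magnitudes along time, limited by the input.

    mag: array of shape (frames, bins) holding |X(k, m)|.
    The one-pole filter and its coefficient are assumed, not from the slides.
    """
    mag = np.asarray(mag, dtype=float)
    sustained = np.empty_like(mag)
    state = mag[0].copy()
    for k in range(mag.shape[0]):
        state = smoothing * state + (1.0 - smoothing) * mag[k]
        state = np.minimum(state, mag[k])   # sustained part cannot exceed input
        sustained[k] = state
    return sustained


def transient_weights(mag, smoothing=0.9, eps=1e-12):
    """Assumed weighting rule: keep only the estimated sustained part."""
    mag = np.asarray(mag, dtype=float)
    sustained = estimate_sustained(mag, smoothing)
    return sustained / np.maximum(mag, eps)
```

For a stationary input the weights stay close to one, while a sudden rise in magnitude (an onset) yields a weight well below one in that frame.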
3. Ambient Sound Enhancement: Attenuation of Transient Signals (2)
Sound example: Input signal (black) overlaid by output signal (red)
[Figure: waveform plot of amplitude over time (0 s to 5 s); curves: Input, Output.]
3. Ambient Sound Enhancement: Attenuation of Tonal Signals
Attenuate spectral components that exceed an estimate of the noise floor, i.e. a locally flat magnitude spectrum.
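One way to picture this step is sketched below. The slide does not say how the noise floor is estimated, so everything specific here is an assumption: the running median across frequency, the window `width`, the rule of pulling peaks down to the floor, and the function name `tonal_weights`.

```python
import numpy as np


def tonal_weights(mag, width=9, eps=1e-12):
    """Spectral weights that pull peaks down to an estimated noise floor.

    mag: array of shape (frames, bins) holding |X(k, m)|.
    The noise floor is taken as a running median across frequency
    (assumed estimator; `width` should be odd).
    """
    mag = np.asarray(mag, dtype=float)
    half = width // 2
    n_bins = mag.shape[1]
    padded = np.pad(mag, ((0, 0), (half, half)), mode="edge")
    # Stack shifted copies so that axis 0 indexes the neighbourhood of each bin.
    windows = np.stack([padded[:, i:i + n_bins] for i in range(width)])
    noise_floor = np.median(windows, axis=0)
    # Bins at or below the floor are kept, bins above it are attenuated.
    return np.minimum(1.0, noise_floor / np.maximum(mag, eps))
```

A locally flat spectrum gives weights of one everywhere, whereas an isolated spectral peak (a tonal component) receives a weight equal to the ratio of the floor to the peak magnitude.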
3. Ambient Sound Enhancement: Decorrelation of Background Sounds
Linear time-invariant processing in the time domain with a dense and short impulse response.
Decorrelation filter structure is a trade-off between sound quality and complexity (computational load, memory requirements and tuning effort).
Here: 3 nested all-pass filters in parallel per output channel.
The tuning of the parameters (delays and gains of the all-pass filters) is of crucial importance.
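The filter structure named above can be sketched as follows. This is an illustration under assumptions, not the tuned filter from the talk: a Schroeder all-pass with a second all-pass nested inside its delay path is used as the nested all-pass, the three parallel branches are combined by averaging, and all delays and gains (`LEFT`, `RIGHT`) are hypothetical values.

```python
import numpy as np


def nested_allpass(x, outer_delay, outer_gain, inner_delay, inner_gain):
    """Schroeder all-pass whose delay path contains a second all-pass.

    Delays are in samples (integers >= 1), gains satisfy |gain| < 1.
    """
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    outer_buf = np.zeros(outer_delay)      # circular delay line, outer stage
    inner_buf = np.zeros(inner_delay)      # circular delay line, inner stage
    oi = ii = 0
    for n in range(len(x)):
        delayed = outer_buf[oi]            # outer state, outer_delay samples ago
        # Inner all-pass applied to the delayed outer state.
        w_delayed = inner_buf[ii]
        w = delayed + inner_gain * w_delayed
        d = -inner_gain * w + w_delayed
        inner_buf[ii] = w
        ii = (ii + 1) % inner_delay
        # Outer all-pass using the inner output as its delayed signal.
        v = x[n] + outer_gain * d
        y[n] = -outer_gain * v + d
        outer_buf[oi] = v
        oi = (oi + 1) % outer_delay
    return y


def decorrelate(x, branches):
    """Average several nested all-pass branches (assumed way of combining)."""
    return np.mean([nested_allpass(x, *p) for p in branches], axis=0)


# Hypothetical parameters: (outer_delay, outer_gain, inner_delay, inner_gain).
LEFT = [(31, 0.5, 7, 0.4), (43, -0.5, 11, 0.4), (59, 0.5, 13, -0.4)]
RIGHT = [(37, -0.5, 5, 0.4), (47, 0.5, 17, -0.4), (61, -0.5, 19, 0.4)]
```

Calling `decorrelate(background, LEFT)` and `decorrelate(background, RIGHT)` gives two differently filtered versions of the same background signal. Each single branch has a flat magnitude response, while the average of the branches generally does not, which is one reason the parameter tuning matters.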
3. Ambient Sound Enhancement: Decorrelator Gain Control
The perceived level of decorrelation (and reverberation) depends on both the processing (impulse response) and the input signal.
Lower effect intensity for stationary input signals than for transient signals or frequency modulated signals (e.g. speech).
The level of decorrelation is controlled using a model for the perceived intensity of decorrelation:
• modified version of a model of reverberance (Uhle et al., 2011),
• based on a model for partial loudness (Moore et al., 1997).
Partial loudness difference = partial loudness of decorrelated signal (masked by the dry input) - partial loudness of dry input (masked by the decorrelated signal)
4. Stereo Width Enhancement: Overview
Extending the width of the stereo image by enhancing inter-channel level differences of direct sound components:
1. Stereo Mid/Side Decomposition,
2. Boost the stereo side signal.
[Block diagram: x(t), STFT, X(k, m), Stereo M/S Decomposition into M(k, m) and S(k, m), gain w, Y(k, m), iSTFT, y(t).]
4. Stereo Width Enhancement: Stereo Mid/Side Decomposition
Stereo side signal:
$S_1 = G_1 X_1$
$S_2 = G_2 X_2$
with spectral weights
$G_i = \max\left(0, \frac{|X_i|^{\alpha} - \kappa\,|D|^{\alpha}}{|X_i|^{\alpha}}\right)^{\beta}$
• D: Downmix of the input signal
• α, β, κ: Tuning parameters for controlling the attenuation
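The spectral weights can be evaluated per time-frequency bin as in the sketch below. Several details are assumptions, not taken from the slides: the downmix is taken as the average of both channels, the output is formed by keeping the remainder of each channel and scaling its side part by `w`, and the default parameter values and function names are hypothetical.

```python
import numpy as np


def side_weights(X_i, D, alpha=1.0, beta=1.0, kappa=1.0, eps=1e-12):
    """G_i = max(0, (|X_i|^alpha - kappa |D|^alpha) / |X_i|^alpha)^beta."""
    num = np.abs(X_i) ** alpha - kappa * np.abs(D) ** alpha
    den = np.maximum(np.abs(X_i) ** alpha, eps)   # guard against empty bins
    return np.maximum(0.0, num / den) ** beta


def widen(X1, X2, w=2.0, **params):
    """Boost the side part of each channel spectrum by the factor w."""
    D = 0.5 * (X1 + X2)                           # assumed downmix
    S1 = side_weights(X1, D, **params) * X1
    S2 = side_weights(X2, D, **params) * X2
    # Assumed recombination: remainder of the channel plus the boosted side.
    Y1 = (X1 - S1) + w * S1
    Y2 = (X2 - S2) + w * S2
    return Y1, Y2
```

With these assumptions, identical channels (a centre-panned source) have zero side signal and pass unchanged, while a bin that is present in one channel only is boosted. Setting `w=1.0` leaves the input untouched.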
5. Evaluation: Listening Test
Listening test with multiple stimuli using loudspeakers.
Conditions:
• coded signal without any postprocessing, as known and hidden “reference”,
• Stereo Width Enhancement (SWE),
• Ambient Sound Enhancement (ASE),
• SWE + ASE.
5 test signals of length between 8 s and 30 s each, loudness normalized (ITU-R BS.1770).
Codecs: MP3 at 64 kbps, AAC at 48 kbps.
12 listeners.
5. Evaluation: Listening Test (2)
1. “How well the spatial image has been improved?”
2. “Sound quality?”
6. Conclusion
In perceptual audio coding, audible artifacts can be introduced when the bitrate is too low.
We have proposed a suite of algorithms each designed for mitigating common types of artifact.
Listening test:
• Both methods achieved a significant improvement.
• The combination of both methods is rated higher than the methods in isolation (“slightly better”).
These tools can be used to implement a Low Bitrate Coding Enhancement system.
Future work: assessment of the performance obtained with a combination of all proposed enhancement tools (presented in Part 1 and Part 2).
Thank you for your attention!
Sonamic Enhancement Sound Demo
In Regency Ballroom
Listen also to
Symphoria 3D
Sonamic Loudness