
The MPEG-4 General Audio Coder

Bernhard Grill
Fraunhofer Institute for Integrated Circuits (IIS)
(Fraunhofer Institut Integrierte Schaltungen)

grl 6/98


Outline

• MPEG-2 Advanced Audio Coding (AAC)
• MPEG-4 Extensions:
– Perceptual Noise Substitution (PNS)
– Long Term Prediction
– TwinVQ Coding Core
• The MPEG-4 Scalable General Audio Coder
• Results of Listening Tests
• Demonstration of a Real-Time Player


MPEG-2 AAC

Encoder Overview:

[Figure: MPEG-2 AAC encoder block diagram. Blocks shown: Input time signal, Gain Control, Filter Bank, TNS, Intensity/Coupling, Prediction, M/S, Scale Factors, Quant., Noiseless Coding, Bitstream Multiplexer, Bitstream Output, Perceptual Model, Rate/Distortion Control.]


Extension: Perceptual Noise Substitution (PNS)

[Figure: AAC encoder block diagram (Input time signal, Gain Control, Filter Bank, TNS, Intensity/Coupling, Prediction, M/S, Scale Factors, Quant., Noiseless Coding, Bitstream Multiplexer, Bitstream Output, Perceptual Model, Rate/Distortion Control) with an added PNS block.]


Perceptual Noise Substitution (2)

Background:
• Parametric coding of noise-like signal components has been used widely, e.g. in speech coding

MPEG-4:
• Perceptual Noise Substitution (PNS) permits a frequency-selective parametric coding of noise-like signal components
• Noise-like signal components are detected on a scalefactor band basis
• Corresponding groups of spectral coefficients are excluded from quantization/coding
• Instead, only a "noise substitution flag" plus total power of the substituted band is transmitted in the bitstream
• Decoder inserts pseudo-random vectors with desired target power as spectral coefficients
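The PNS steps on this slide (transmit only a band's total power, then fill the band with scaled pseudo-random values in the decoder) can be illustrated with a minimal Python sketch. The function names, the uniform noise source, the seed, and the example band values are assumptions made for illustration; the slides do not specify the noise generator or how the power is quantized.

```python
import math
import random

def band_power(coeffs):
    """Total power (sum of squares) of one scalefactor band."""
    return sum(x * x for x in coeffs)

def pns_fill_band(target_power, width, rng):
    """Return `width` pseudo-random coefficients whose total power is target_power.

    Hypothetical helper: a uniform noise source is assumed here.
    """
    noise = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    power = band_power(noise)
    if power == 0.0:
        return [0.0] * width
    scale = math.sqrt(target_power / power)
    return [x * scale for x in noise]

# Encoder side: a band detected as noise-like is reduced to its total power.
original_band = [0.31, -0.12, 0.44, -0.27, 0.05, -0.38, 0.22, 0.16]  # made-up values
transmitted_power = band_power(original_band)

# Decoder side: the band is regenerated from the power alone.
decoded_band = pns_fill_band(transmitted_power, len(original_band), random.Random(1998))
print(round(band_power(decoded_band), 6), round(transmitted_power, 6))
```

The decoded coefficients differ from the original ones sample by sample; only the band power is preserved, which is the point of a parametric representation.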


Perceptual Noise Substitution (3)

"Perceptual Noise Substitution" (PNS): perceptual coder + parametric representation of noise-like signals

[Figure: encoder and decoder block diagrams.
Encoder: Audio Input, Analysis Filterbank, Perceptual Model, Noise Detection, Quantization & Coding, Bitstream Multiplexer, Bitstream Out. Noise Detection supplies "Noise subst. signaling" and "Substituted signal energies".
Decoder: Bitstream In, Bitstream Demultiplexer, Inverse Quantization, Noise Generator, Synthesis Filterbank, Audio Output. The Noise Generator is driven by "Noise subst. signaling" and "Substituted signal energies".]


Extension 2: Long Term Prediction

[Figure: AAC encoder block diagram (Input time signal, Gain Control, Filter Bank, TNS, Intensity/Coupling, Prediction, M/S, Scale Factors, Quant., Noiseless Coding, Bitstream Multiplexer, Bitstream Output, Perceptual Model, Rate/Distortion Control), shown for the Long Term Prediction extension.]


Long Term Prediction (2)

Motivation:
• Tone-like signals require much higher coding precision than noise-like signals (e.g. 20 dB vs. 6 dB)
• Tonal signal components are predictable

MPEG-2 AAC:
• Prediction of each spectral coefficient with backward adaptive predictor
• High complexity (ca. 50% of decoder computation & RAM)

MPEG-4:
• Long Term Predictor (LTP) as known from speech coding
• Lower complexity: saving of approx. 50% in terms of computation and memory over MPEG-2 predictors
• Comparable performance to MPEG-2 predictors
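As an illustration of a long term predictor "as known from speech coding", the following Python sketch searches for the lag and gain that best predict the current frame from past samples. It is a generic time-domain formulation under stated assumptions: the function name `find_ltp`, the lag range, and the test tone are made up, and the way MPEG-4 applies the prediction to the spectral coefficients is not shown here.

```python
import math

def find_ltp(history, frame, min_lag, max_lag):
    """Search for the lag and gain that best predict `frame` from `history`.

    `history` holds past samples, most recent last. The sketch assumes
    len(frame) <= min_lag <= max_lag <= len(history), so that only samples
    from the past are used.
    """
    n = len(frame)
    best_lag, best_gain, best_score = min_lag, 0.0, -1.0
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - lag
        past = history[start:start + n]
        corr = sum(x * p for x, p in zip(frame, past))
        energy = sum(p * p for p in past)
        if energy == 0.0:
            continue
        # With the optimal gain corr/energy, the residual energy drops by corr^2/energy.
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain

def tone(t):
    """Made-up tonal test signal with a period of 40 samples."""
    return math.sin(2 * math.pi * t / 40) + 0.5 * math.sin(2 * math.pi * t / 20 + 0.3)

signal = [tone(t) for t in range(232)]
history, frame = signal[:200], signal[200:]

lag, gain = find_ltp(history, frame, min_lag=32, max_lag=70)
start = len(history) - lag
prediction = [gain * p for p in history[start:start + len(frame)]]
residual = [x - p for x, p in zip(frame, prediction)]
print(lag, round(gain, 6), sum(r * r for r in residual))
```

Only one lag and one gain are needed per frame, which gives an intuition for why such a predictor is cheaper than a separate backward-adaptive predictor for every spectral coefficient.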


Extension 3: Twin-VQ

[Figure: AAC encoder block diagram (Input time signal, Gain Control, Filter Bank, TNS, Intensity/Coupling, Prediction, M/S, Scale Factors, Quant., Noiseless Coding, Bitstream Multiplexer, Bitstream Output, Perceptual Model, Rate/Distortion Control), shown for the Twin-VQ extension.]


Transform-Domain Weighted Interleave VQ (2)

Background:
• Audio coding at extremely low bitrates (6-8 kbit/s)
• CELP speech coders do not perform well for music
• 0.5 bits per frequency line at these data rates!

MPEG-4:
• Transform-Domain Weighted Interleave Vector Quantization (TwinVQ) as alternative coding kernel
– Vector selection under control of the perceptual model
• Fully integrated into MPEG-4 AAC coding system:
– Uses same spectral representation as AAC coder
– Makes use of other MPEG-4 tools (e.g. LTP, TNS, joint stereo coding)
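The "0.5 bits per frequency line" figure can be reproduced with a short calculation. A critically sampled transform such as the MDCT produces as many spectral lines per second as there are input samples per second, so the budget per line is the bitrate divided by the sampling rate. The 16 kHz sampling rate below is an assumption chosen for illustration; the slides do not state which sampling rate the figure refers to.

```python
def bits_per_line(bitrate_bps, sample_rate_hz):
    """Average bit budget per spectral line for a critically sampled transform."""
    return bitrate_bps / sample_rate_hz

# Assumed operating point: 8 kbit/s at a 16 kHz sampling rate (not from the slides).
print(bits_per_line(8000, 16000))  # 0.5
```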


Transform-Domain Weighted Interleave VQ (3)

Structure:
• Normalization of spectral coefficients:
– LPC envelope (overall spectral shape)
– Periodic component coding (harmonic components)
– Bark-scale envelope coding (additional flattening)
• Vector Quantization (VQ) process:
– Interleaving of spectral coefficients into new sub-vectors
– Vector quantization (two sets of codebooks, weighted distortion measure allows distortion control by perceptual model)
⇒ no bit/noise allocation or rate control iteration
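The two VQ steps named on this slide, interleaving and a weighted distortion measure, can be sketched in Python as follows. This is a simplified illustration, not the TwinVQ algorithm itself: it uses a single tiny codebook instead of the two sets of codebooks mentioned above, and the coefficients, weights, and codebook entries are made-up values.

```python
def interleave(values, num_subvectors):
    """Distribute values so that element i lands in sub-vector i mod num_subvectors."""
    return [values[k::num_subvectors] for k in range(num_subvectors)]

def weighted_vq_search(vector, weights, codebook):
    """Return (index, distortion) of the codebook entry with least weighted squared error."""
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum(w * (x - c) ** 2 for x, c, w in zip(vector, code, weights))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist

# Made-up flattened spectrum and hypothetical perceptual weights (larger = more audible).
spectrum = [0.9, -0.1, 0.2, 0.8, -0.7, 0.1, 0.0, -0.9]
weights = [4.0, 4.0, 2.0, 2.0, 1.0, 1.0, 0.5, 0.5]
codebook = [
    [1.0, -1.0, 0.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
    [0.0, 0.0, 0.0, 0.0],
]

sub_vectors = interleave(spectrum, 2)
sub_weights = interleave(weights, 2)
indices = [weighted_vq_search(v, w, codebook)[0] for v, w in zip(sub_vectors, sub_weights)]
print(indices)  # [1, 2]
```

Because every sub-vector costs a fixed number of index bits, the bitrate is fixed by construction, which matches the slide's remark that no bit allocation or rate control iteration is needed; the perceptual model acts only through the weights.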


Scalability

Definition:
• Capability to decode useful sub-sets of the bitstream

Types of Scalability:
• SNR / NMR (Noise to Mask Ratio) Scalability:
– "Extension layers improve the SNR/NMR of the coded signal"
• Audio Bandwidth Scalability:
– "Extension layers increase the decodable audio bandwidth"
• Restriction of Generality:
– Very low bit rate core coder optimized for special signals, e.g. speech. Additional layers provide good quality for all types of signals.
• Implementation Complexity:


Application examples

• Network based (packetized) transmission
– Requires routers which know about the importance of a packet
– Less important (outer layer) packets may be dropped if the available bandwidth decreases
• Broadcast
– The most important (inner layer) packets are transmitted with a better error protection scheme
• Music data base
– High quality content is encoded and stored
– Access to a lower quality version is possible without recoding to allow for pre-listening with a lower quality
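The packet-dropping idea in the network example can be sketched as follows: a hypothetical router keeps the inner layers and discards outer ones until the stream fits the available bandwidth. The function name and the bandwidth values are assumptions; the layer rates 6 + 18 + 24 kbit/s are taken from the typical configurations listed later in this deck.

```python
def select_layers(layer_rates_kbps, available_kbps):
    """Keep the longest run of layers, inner layer first, that fits the bandwidth.

    A layer is only useful if all layers below it are present, so the
    selection stops at the first layer that does not fit.
    """
    kept, total = [], 0
    for rate in layer_rates_kbps:
        if total + rate > available_kbps:
            break
        kept.append(rate)
        total += rate
    return kept, total

layers = [6, 18, 24]  # core layer first, then two enhancement layers (kbit/s)

print(select_layers(layers, 48))  # ([6, 18, 24], 48)
print(select_layers(layers, 30))  # ([6, 18], 24)
print(select_layers(layers, 10))  # ([6], 6)
```

The same selection, applied to stored content, corresponds to the music data base case: a lower quality version is obtained by reading only the inner layers, without recoding.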


Scalable GA Coder (I)

• Encoding of the error signal of an AAC or Twin-VQ Quantization and Coding (Q&C) module in a second, or third, or n-th similar quantization module in the frequency domain
• Solutions using only AAC, or only Twin-VQ modules possible
• Additionally, Twin-VQ / AAC combinations defined
• Useful for large enhancement steps of >= 8 kbit/s per step

Encoder Block Diagram

[Figure: encoder block diagram. Blocks shown: MDCT, Perceptual Model, Q&C (two instances), Reconstruct Spectrum, subtraction (+/-), FSS.]
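The layering principle of this slide, where each further module codes the error left by the previous one, can be sketched with plain uniform quantizers. This is an illustration under stated assumptions: uniform scalar quantizers stand in for the AAC and Twin-VQ Q&C modules, the FSS block is not modeled, and the coefficients and step sizes are made up.

```python
def quantize(values, step):
    """Uniform scalar quantizer: index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]

def encode_layers(spectrum, steps):
    """Quantize the spectrum, then quantize each stage's error in the next stage."""
    layers = []
    residual = list(spectrum)
    for step in steps:
        indices = quantize(residual, step)
        layers.append(indices)
        # The error left by this stage becomes the input of the next one.
        residual = [r - i * step for r, i in zip(residual, indices)]
    return layers

def decode_layers(layers, steps):
    """Sum the reconstructions of however many layers were received."""
    if not layers:
        return []
    output = [0.0] * len(layers[0])
    for indices, step in zip(layers, steps):
        output = [o + i * step for o, i in zip(output, indices)]
    return output

spectrum = [0.83, -2.41, 5.07, -0.26]  # made-up spectral coefficients
steps = [2.0, 0.5, 0.125]              # hypothetical step sizes, coarse to fine
layers = encode_layers(spectrum, steps)

for count in range(1, len(layers) + 1):
    decoded = decode_layers(layers[:count], steps)
    worst = max(abs(s - d) for s, d in zip(spectrum, decoded))
    print(count, "layer(s): max error", round(worst, 4))
```

Decoding any leading subset of the layers yields a valid, coarser reconstruction, and each added layer tightens the error bound to half of its own step size.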


Scalable GA Coder (II)

• Twin-VQ Q&C modules
– 8 kbit/s fixed step size Vector Quantizer (VQ) modules
– Optional 6 kbit/s in first layer
– First choice for a 6 or 8 kbit/s base layer for the coding of general audio signals
• AAC Q&C modules
– Any step size possible
– Reasonable step sizes from 8 to >64 kbit/s
– The same end quality can be achieved as from a single-step AAC coder
– However, a higher bit rate may be required for the same audio quality

Decoder Block Diagram

[Figure: decoder block diagram. Blocks shown: Reconstruct Spectrum (two instances), Inv. FSS, Add (+), IMDCT.]


Scalable GA Coder: Combination with CELP Coder (I)

• Very low bitrate core coder (e.g. speech coder)
• Core coder typically operating at a lower sampling frequency
• MDCT used for efficient up-sampling

Encoder

[Figure: encoder block diagram. Blocks shown: CELP codec, MDCT (two instances), FSS, Perceptual model, Quantization & Coding.]
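The slides do not explain the FSS block in the diagram. The Python sketch below assumes that FSS denotes a frequency-selective switch which decides, band by band, whether the enhancement layer codes the difference between the original spectrum and the core coder's spectrum or the original spectrum itself, whichever has less energy. All function names, band edges, and coefficient values are hypothetical.

```python
def energy(values):
    return sum(v * v for v in values)

def fss_encode(original, core, band_edges):
    """Per band, pass on either (original - core) or the original spectrum.

    Assumed behavior of the FSS block; returns the per-band flags and the
    spectrum handed to the enhancement-layer quantizer.
    """
    flags, output = [], []
    for lo, hi in zip(band_edges, band_edges[1:]):
        diff = [o - c for o, c in zip(original[lo:hi], core[lo:hi])]
        use_diff = energy(diff) < energy(original[lo:hi])
        flags.append(use_diff)
        output.extend(diff if use_diff else original[lo:hi])
    return flags, output

def fss_decode(flags, enhancement, core, band_edges):
    """Inverse switch: add the core spectrum back only where the difference was coded."""
    output = []
    for flag, lo, hi in zip(flags, band_edges, band_edges[1:]):
        band = enhancement[lo:hi]
        if flag:
            band = [e + c for e, c in zip(band, core[lo:hi])]
        output.extend(band)
    return output

# Made-up spectra: the core is accurate in band 0, poor in band 1, and empty in
# band 2 (above the bandwidth of a core running at a lower sampling frequency).
original = [1.0, 0.9, 0.5, -0.4, 0.3, 0.2]
core = [0.9, 1.0, 0.1, 0.6, 0.0, 0.0]
band_edges = [0, 2, 4, 6]

flags, enhancement_input = fss_encode(original, core, band_edges)
restored = fss_decode(flags, enhancement_input, core, band_edges)
print(flags)  # [True, False, False]
```

In this sketch the enhancement layer is passed through unquantized, so the decoder restores the original spectrum up to floating-point rounding; in a real coder the quantization error of the enhancement layer would remain.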


Scalable GA Coder: Combination with Core Coder (II)

Decoder

[Figure: decoder block diagram. Blocks shown: CELP decoder, MDCT, Re-quantization (two instances), IFSS, Add (+), IMDCT.]


Scalable Stereo Coding: Stereo / Stereo


Scalable Stereo Coding: Mono / Stereo


Scalable Stereo Coding: Mono Core / Mono GA / Stereo GA


Scalable GA Coder: Typical Configurations

• Some successfully tested mono/mono combinations:
– 6 kbit/s CELP + 18 kbit/s AAC
– 6 kbit/s TwinVQ + 18 kbit/s AAC
– 8 kbit/s TwinVQ + 8 kbit/s TwinVQ
– 6 kbit/s CELP + 18 kbit/s + 24 kbit/s AAC
• Mono/stereo combinations
– 6 kbit/s mono CELP + 18 kbit/s mono + 24 kbit/s stereo AAC
– 24 kbit/s mono + 16 kbit/s stereo + 16 kbit/s stereo AAC
– 24 kbit/s mono + 72 kbit/s stereo AAC
• Stereo/stereo combinations
– 2 x 6 kbit/s mono CELP + 36 kbit/s stereo AAC


Results (I): Mono Configurations

[Figure: bar chart of listening test scores, vertical axis from 0 to 4.5. Values read from the chart's data row, in the order of the bar labels:]

Coder                       Score
Layer 3, 24 kbit/s          3.5
AAC, 24 kbit/s              4.1
CELP + AAC, 24 kbit/s       3.85
TwinVQ + AAC, 24 kbit/s     3.65
AAC, 18 kbit/s              3.27


Results (II): Mono / Stereo Configuration

[Figure: bar chart of listening test scores, vertical axis from 0 to 5. Bars for Layer 3 (L3), AAC, and the scalable coder (scal) at 24 kbit/s, 40 kbit/s, and 56 kbit/s, grouped as Mono and Stereo. The bar values are not recoverable from the transcript.]


Conclusions

• Highest quality coding with proven AAC technology
• PNS, LTP and TwinVQ further enhance the very low bitrate performance
• Mono, Stereo, and Multi-channel Stereo supported
• Bitrate range 6 - ~300 kbit/s per channel at 8 - 96 kHz sampling rate
• Additional flexibility with the scalable coding modes
– Unique capabilities through the availability of the mono-stereo coding modes
• Overall complexity within the limits of today's hardware

⇒ The MPEG-4 GA coder is the most versatile audio coding system available today

• Low-Delay and Error Resilience additions in MPEG-4 Version 2