
MPEG-3 For Audio

Presented by:

Chun Lui

Sunjeev Sikand

History of MP3

• In 1987, the Fraunhofer IIS started to work on perceptual audio coding in the framework of the EUREKA project EU147, Digital Audio Broadcasting (DAB).

• In cooperation with the University of Erlangen (Prof. Dieter Seitzer), the Fraunhofer IIS finally devised a very powerful algorithm that is standardized as ISO-MPEG Audio Layer-3 (IS 11172-3 and IS 13818-3).

• MP3 = Moving Picture Experts Group (MPEG) Audio Layer 3

Background: How CD stores music

• Music is sampled 44,100 times per second

• Each sample is 2 bytes (16 bits)

• A sample is taken separately for the left and right speakers

• 44,100 samples/sec * 16 bits/sample * 2 channels = 1,411,200 bits/sec

• 3-minute song: 1,411,200 bits/sec * 180 sec / 8 bits/byte = 31,752,000 bytes

• Yikes!
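The arithmetic on this slide can be checked with a short sketch. The function and constant names are illustrative only; the figures are the ones given above.

```python
# CD audio parameters as stated on the slide.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def cd_bit_rate():
    """Uncompressed CD data rate in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def uncompressed_bytes(seconds):
    """Size in bytes of `seconds` of uncompressed CD audio (8 bits per byte)."""
    return cd_bit_rate() * seconds // 8


print(cd_bit_rate())            # 1411200
print(uncompressed_bytes(180))  # 31752000
```

Note the division by 8: the data rate is in bits per second, while the size of the song is in bytes.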

The 3 MPEG layers

Three Layers, Three Applications

Name     Compression factor  Bit rate             Application
Layer 1  1:4                 384 Kbit/sec         Digital Compact Cassettes
Layer 2  1:6 to 1:8          256 to 192 Kbit/sec  Digital Radio
Layer 3  1:10 to 1:12        128 to 112 Kbit/sec  Digital Internet Music

MP3 Performance

An Overview of MP3 Quality Levels

Sound quality         Mode    Bit rate           Compression ratio
Telephone             mono    8 Kbit/s           96:1
Better than SW radio  mono    16 Kbit/s          48:1
Better than MW radio  mono    32 Kbit/s          24:1
Similar to VHF radio  stereo  56 to 64 Kbit/s    26 to 24:1
Similar to CD         stereo  96 Kbit/s          16:1
CD quality            stereo  112 to 128 Kbit/s  14 to 12:1
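A compression ratio is the uncompressed PCM bit rate divided by the MP3 bit rate. The sketch below uses hypothetical helper names. It suggests that the ratios in the table correspond to a 48 kHz, 16-bit source rather than the 44.1 kHz CD rate. That is an inference from the arithmetic, not something the slides state.

```python
def pcm_bit_rate(sample_rate_hz, channels, bits_per_sample=16):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(pcm_bps, mp3_bps):
    """How many times smaller the MP3 stream is than the PCM stream."""
    return pcm_bps / mp3_bps


# Compare the two plausible source sample rates for 128 kbit/s stereo.
for label, rate in (("44.1 kHz", 44_100), ("48 kHz", 48_000)):
    ratio = compression_ratio(pcm_bit_rate(rate, 2), 128_000)
    print(f"{label} stereo at 128 kbit/s: {ratio:.1f}:1")

# Mono telephone quality at 8 kbit/s from a 48 kHz source.
print(compression_ratio(pcm_bit_rate(48_000, 1), 8_000))
```

With a 48 kHz source the results are 12:1 for 128 kbit/s stereo and 96:1 for 8 kbit/s mono, matching the table. With a 44.1 kHz source, 128 kbit/s stereo comes out at roughly 11:1.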

How Does It Work?

• Every MP3 encoder combines two techniques

• First, it reduces the digitized audio stream with the help of perceptual noise shaping.

• Second, the reduced, frequency-cleansed data is shrunk again using Huffman coding.

Perceptual Noise Shaping

• Much like YUV encoding and other compression schemes, it takes advantage of human perception, in this case psychoacoustics

• The human ear cannot distinguish between two very similar frequencies

• There are sounds that the human ear cannot hear

• There are sounds that the human ear hears much better than others

• If two sounds play simultaneously, the louder one masks the quieter one

Perceptual Noise Shaping Example

• If one band contains a louder sound, the quieter neighboring bands that it masks do not need to be encoded in full detail
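The masking idea can be illustrated with a deliberately simplified toy model. It is not the psychoacoustic model of any real encoder. The function name, the 20 dB margin and the "immediate neighbors only" rule are all assumptions made for the sake of the example.

```python
def audible_bands(levels_db, mask_drop_db=20.0):
    """Return the indices of bands that are NOT masked by a louder neighbor.

    Toy rule (an assumption, not the MP3 standard): a band is masked when an
    immediately adjacent band is more than `mask_drop_db` louder.
    """
    kept = []
    for i, level in enumerate(levels_db):
        # Collect the left and right neighbors that exist.
        neighbours = levels_db[max(0, i - 1):i] + levels_db[i + 1:i + 2]
        loudest = max(neighbours, default=float("-inf"))
        if level >= loudest - mask_drop_db:
            kept.append(i)
    return kept


# Band 1 is loud and masks bands 0 and 2; band 3 is masked by band 2.
print(audible_bands([60.0, 85.0, 62.0, 40.0, 55.0]))  # [1, 4]
```

A real encoder does not simply discard masked bands. It assigns them fewer bits so that the resulting quantization noise stays below the masking threshold.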

Encoding a Wave File in MP3

Analytical Filter

• The audio signal passes through a filter bank that divides it into 32 sub-bands; a modified discrete cosine transform (MDCT) then refines these into 576 frequency lines. This requires fairly complex filters.

• In real time, the encoder identifies the frequency components that are not needed and removes them iteratively (repeatedly) until the best possible result is achieved.
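Where the figure 576 comes from can be sketched as follows, assuming the usual Layer III layout of 32 sub-bands that are each refined into 18 frequency lines by a modified discrete cosine transform (MDCT) over 36 samples. This is a naive, unwindowed MDCT written for readability. Real encoders apply a window and use fast algorithms.

```python
import math


def mdct(block):
    """Naive MDCT: maps 2N input samples to N frequency coefficients."""
    n_out = len(block) // 2
    coeffs = []
    for k in range(n_out):
        total = 0.0
        for n, sample in enumerate(block):
            total += sample * math.cos(
                math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
        coeffs.append(total)
    return coeffs


SUBBANDS = 32           # outputs of the filter bank
LINES_PER_SUBBAND = 18  # MDCT coefficients per sub-band (long block)

# One sub-band's worth of input: 36 samples of a test tone.
block = [math.sin(2 * math.pi * 3 * n / 36) for n in range(36)]
lines = mdct(block)
print(len(lines), SUBBANDS * len(lines))  # 18 576
```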

Masking Threshold Evaluation

• At the same time, the audio signal passes through the psychoacoustic model. For every sub-band of the signal spectrum, the masking threshold is determined using the discrete Fourier transform.

• Joint stereo coding can then exploit the fact that the two channels of a stereo pair carry largely the same information. These stereophonic irrelevancies and redundancies are removed to reduce the total bit rate.
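One common form of joint stereo is mid/side coding: instead of left and right, the encoder stores their average and their difference. The slides do not say which joint stereo mode is meant, so the following is only a sketch of the idea with illustrative function names. The standard's actual scaling factors are omitted.

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Reconstruct left/right samples from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [100, 200, -50]
right = [98, 202, -50]
mid, side = to_mid_side(left, right)
print(mid)   # [99.0, 201.0, -50.0]
print(side)  # [1.0, -1.0, 0.0]
```

When the two channels are nearly identical, the side signal stays close to zero and needs very few bits.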

Quantization and Encoding

• Quantizing the samples offers another opportunity for data reduction. Every sample is made up of 16 bits, but not all 16 are necessarily needed to represent the sound. As such, the leading zeros of a 16-bit sample may be left out.

• At the same time, the individual samples are analyzed and compressed again using Huffman coding. This reduces the data by about a further 20 percent.
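Huffman coding gives frequent values short codes and rare values long ones. The sketch below builds a code from the data itself, purely for illustration. Real MP3 encoders select from predefined Huffman tables instead. The sample values are made up.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix-free code {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)
        c2, _, codes2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantized values: mostly zeros, as is typical after quantization.
quantized = [0, 0, 0, 1, 0, -1, 0, 0, 2, 0, 1, 0]
codes = huffman_code(quantized)
encoded = "".join(codes[v] for v in quantized)
print(len(encoded), "bits, versus", 2 * len(quantized), "bits at a fixed 2 bits per value")
```

For this input the Huffman-coded stream takes 18 bits instead of 24, because the value 0 occurs eight times and receives a one-bit code.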

Bitstream creation

• Now all the data has been gathered.

• Everything is recorded and digitized.

• Finally, the encoder forms the bit stream, which ultimately represents the MP3 file: the compressed data is packed into so-called frames.

• For MP3, there are 1152 samples per frame (32 sub-bands * 36 samples). Every frame consists of a header, a checksum (CRC), the audio data and sometimes a bit reservoir.
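Given 1152 samples per frame, the duration and size of a frame follow directly from the sample rate and bit rate. The sketch below uses the commonly cited frame-length formula for MPEG-1 Layer III (144 * bit rate / sample rate, plus an optional padding byte). The slides do not state this formula, and the function names are illustrative.

```python
SAMPLES_PER_FRAME = 1152  # 32 sub-bands * 36 samples


def frame_duration_ms(sample_rate_hz):
    """Playback time covered by one frame, in milliseconds."""
    return 1000 * SAMPLES_PER_FRAME / sample_rate_hz


def frame_length_bytes(bit_rate_bps, sample_rate_hz, padding=0):
    """Frame size in bytes; 1152 samples / 8 bits per byte gives the factor 144."""
    return (SAMPLES_PER_FRAME // 8) * bit_rate_bps // sample_rate_hz + padding


print(round(frame_duration_ms(44_100), 2))              # 26.12
print(frame_length_bytes(128_000, 44_100))              # 417
print(frame_length_bytes(128_000, 44_100, padding=1))   # 418
```

At 44.1 kHz the exact value is not a whole number of bytes, so encoders alternate between 417 and 418 bytes by means of the padding byte to hold 128 kbit/s on average.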

MP3 Bit Rate

• Measured in bits per second

• Generally speaking, the higher the bit rate of an MP3 file, the higher its quality

• There are 3 bit rate modes:

– Constant Bit Rate (CBR)

– Variable Bit Rate (VBR)

– Average Bit Rate (ABR)

Constant Bit Rate

• Same bit rate for the entire file

• Streams audio more efficiently

• Quality of the encoded content is not constant

– Because some content is harder to compress than other content

– Different songs encoded at the same bit rate may differ in quality
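With a constant bit rate, the size of the audio data depends only on the bit rate and the duration, whatever the content. A minimal sketch with an illustrative function name; it ignores metadata such as tags:

```python
def cbr_audio_bytes(bit_rate_bps, seconds):
    """Size in bytes of a constant-bit-rate stream of the given length."""
    return bit_rate_bps * seconds // 8


# A 3-minute song at 128 kbit/s.
print(cbr_audio_bytes(128_000, 180))  # 2880000
```

That is about 2.9 MB, compared with 31,752,000 bytes for the same 3 minutes of uncompressed CD audio.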

Variable Bit Rate

• Tries to achieve the best quality

• Uses a different bit rate for each frame of the MP3 file

• Uses a higher bit rate where needed

• Better quality

• May result in a larger file size

• Many MP3 players do not support VBR

Average Bit Rate

• A little like VBR

• Targets the average of all the bit rates VBR would use

• Dynamically changes the bit rate for better quality

• The final file size is known in advance

• Overall quality is slightly lower than VBR
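Because every frame covers the same playback time, the average bit rate of a file is simply the mean of its per-frame bit rates, and the file size follows from that average. The sketch below uses made-up per-frame values and illustrative function names.

```python
def average_bit_rate(frame_bit_rates_bps):
    """Mean bit rate over frames of equal duration, in bits per second."""
    return sum(frame_bit_rates_bps) / len(frame_bit_rates_bps)


def estimated_bytes(avg_bps, seconds):
    """Approximate audio size in bytes for a given average bit rate."""
    return int(avg_bps * seconds / 8)


# Hypothetical per-frame bit rates chosen by an encoder.
frames = [96_000, 128_000, 160_000, 128_000]
avg = average_bit_rate(frames)
print(avg)                        # 128000.0
print(estimated_bytes(avg, 180))  # 2880000
```

An ABR encoder lets individual frames go above or below the target as long as the mean stays on it, which is why the final size is predictable even though the per-frame rate varies.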

MP3 Competition

• Windows Media Format - CD quality at half the bit rate

• Real Audio - Designed for lower bit-rate

• Liquid Audio - implements more security so less appealing to users

Acknowledgements

• Fraunhofer IIS: http://www.iis.fraunhofer.de/amm/techinf/layer3/index.html

• Intel: http://www.intel.com/english/home/maximize/article/mp3/how/index.htm

• Koning, Verhelst. On Psychoacoustic Noise Shaping for Audio Requantization. Vrije Universiteit Brussel.

• msdn.microsoft.com/library


