audiocompression&mp3standard

Upload: gdv-santhosh-kumar

Post on 05-Apr-2018

TRANSCRIPT

  • 8/2/2019 AudioCompression&MP3Standard

    1/12

    MPEG, the MP3 Standard,

    and Audio Compression

    Mark Kilgore and Jamie Wu

    Mathematics of the Information Age

    September 16, 2003

    Audio Compression

    • Basic Audio Coding.

    • Why beneficial to compress?

    • Lossless versus Lossy Compression.

    • How are MP3s Compressed?

    • What makes MP3 Compression Different?

    • What other formats lie in our future?

    PCM

    Why Compress?

    • Eliminate redundancy

    • The most basic encoder/decoder is PCM (pulse-code modulation)

    • PCM is highly redundant for a signal such as a basic sine wave, because it stores every sample in time

    • If the sine wave is represented by frequency rather than time, it is only necessary to store its frequency, amplitude, and phase

    • Data can be reduced without information loss

    • Compression extends playing time, allows for miniaturization and greater equipment tolerance, and reduces cost
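
    The sine-wave argument can be checked numerically. The sketch below is only an illustration: the helper names `sine_pcm` and `describe_sine` are made up for this example, and it assumes the tone completes a whole number of cycles within the block so that it falls exactly on one DFT bin.

```python
import cmath
import math


def sine_pcm(cycles, amplitude, phase, n_samples):
    """Time-domain (PCM-style) samples of a single sinusoid."""
    return [
        amplitude * math.cos(2 * math.pi * cycles * n / n_samples + phase)
        for n in range(n_samples)
    ]


def describe_sine(samples):
    """Recover (cycles, amplitude, phase) from the strongest DFT bin."""
    n_samples = len(samples)

    def dft_bin(k):
        return sum(
            x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(samples)
        )

    bins = {k: dft_bin(k) for k in range(1, n_samples // 2)}
    cycles = max(bins, key=lambda k: abs(bins[k]))
    amplitude = 2 * abs(bins[cycles]) / n_samples
    phase = cmath.phase(bins[cycles])
    return cycles, amplitude, phase


# 64 time samples on one side, three numbers on the other.
samples = sine_pcm(cycles=5, amplitude=0.8, phase=0.3, n_samples=64)
cycles, amplitude, phase = describe_sine(samples)
rebuilt = sine_pcm(cycles, amplitude, phase, len(samples))
max_error = max(abs(a - b) for a, b in zip(samples, rebuilt))
print(len(samples), "samples described by 3 parameters; max error:", max_error)
```

    Real audio is not a single stationary tone, which is why the coders discussed later work block by block and keep many frequency components.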

    Lossless vs. Lossy (Perceptual)

    • Lossless coding allows perfect reconstruction of a signal (theoretically)

    • Lossy coding creates a more highly compressed signal, but some perceptually unnecessary frequencies are eliminated

    • Perceptually, however, well-designed lossy coding ideally results in no difference in how it SOUNDS to a person

    • MP3s are lossy, but aim to be perceptually lossless
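
    The distinction can be made concrete with a small sketch. It assumes a made-up test signal, uses `zlib` as a stand-in lossless coder, and uses plain coarse rounding as a stand-in lossy coder; none of this is how MP3 itself works, and the rounding here is not perceptually tuned.

```python
import math
import struct
import zlib

# Hypothetical test signal: 1000 16-bit PCM samples of a decaying tone.
samples = [
    int(12000 * math.exp(-n / 400) * math.sin(2 * math.pi * n / 50))
    for n in range(1000)
]
raw = struct.pack("<1000h", *samples)

# Lossless: every sample comes back exactly.
packed = zlib.compress(raw)
restored = list(struct.unpack("<1000h", zlib.decompress(packed)))

# Lossy: keep only multiples of STEP, so fewer bits are needed per
# sample but the original values cannot be recovered exactly.
STEP = 64
coarse = [round(s / STEP) for s in samples]
approx = [c * STEP for c in coarse]
worst = max(abs(a - b) for a, b in zip(samples, approx))

print("lossless exact:", restored == samples)
print("lossy exact:", approx == samples, "worst error:", worst)
```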

    MPEG

    • Moving Picture Experts Group

    • Aim to create standards relating to synchronized audio and video compression

    • MPEG-1

    • MPEG-2

    MPEG-1 Block Diagrams

    Topics Discussed in Detail After Diagrams

    Layers I and II

    [Block diagram. Blocks shown: Filter Bank (32 sub-bands, numbered 0 to 31); DFT 512/1024 with Hann window; Psychoacoustic Model; Uniform Midtread Quantizer; Coding of Side Information; Bitstream Formatting; output: Coded Audio Data]

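    Layer III

    [Block diagram. Blocks shown: DFT 2 * 1024 with Hann window; Filter Bank (32 sub-bands, numbered 0 to 31); MDCT; Psychoacoustic Model; Non-Uniform Midtread Quantizer; Rate/Distortion Loop; Huffman Coding; Coding of Side Information; Bitstream Formatting; output: Coded Audio Data. A second index range, 0 to 511, is also marked on the diagram]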

    Time to Frequency Mapping

    • Filters split the signal into K frequency bands

    • Each band is quantized to a limited number of bits

    • Quantization noise is placed in bands where it is barely audible

    • The result is sent to the decoder, where the sound is restored

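    [Figure: K-band filter bank. Encoder: the input x passes through analysis filters H0 ... HK, each followed by downsampling by K, giving sub-band signals y0 ... yK. Decoder: y0 ... yK are upsampled by K and passed through synthesis filters G0 ... GK to produce the output x]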
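
    The encoder/decoder structure can be tried out on the smallest possible case. The sketch below swaps in a two-band Haar filter pair (K = 2), which is not the filter bank MPEG uses; it is chosen only because perfect reconstruction is easy to see. The function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)


def analyze(x):
    """Encoder side: split x into a low band and a high band, each
    downsampled by K = 2 (Haar filters, not the MPEG filters)."""
    pairs = list(zip(x[0::2], x[1::2]))
    low = [(a + b) / SQRT2 for a, b in pairs]
    high = [(a - b) / SQRT2 for a, b in pairs]
    return low, high


def synthesize(low, high):
    """Decoder side: upsample both bands and recombine them."""
    x = []
    for lo, hi in zip(low, high):
        x.append((lo + hi) / SQRT2)
        x.append((lo - hi) / SQRT2)
    return x


x = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]  # even length assumed
low, high = analyze(x)
rebuilt = synthesize(low, high)
max_error = max(abs(a - b) for a, b in zip(x, rebuilt))
print("bands:", len(low), "+", len(high), "samples; max error:", max_error)
```

    Without quantization the two bands together hold exactly as many samples as the input; compression comes from quantizing each band with only as many bits as it needs.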

    Z Transform

    • Assists in splitting frequencies

    • Discrete-time generalization of the Fourier transform

    • Important properties

        • Linearity

        • Convolution Theorem

        • Delay Theorem

    • Can model all kinds of filter banks through it

    • Representation of frequency content
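
    The three listed properties can be checked numerically for finite sequences, for which the Z transform is simply X(z) = sum over n of x[n] z^(-n). The sketch below is illustrative only; the helper names and the test point z are arbitrary choices.

```python
def z_transform(seq, z):
    """Evaluate X(z) = sum_n seq[n] * z**(-n) for a finite sequence."""
    return sum(x * z ** (-n) for n, x in enumerate(seq))


def convolve(a, b):
    """Direct (time-domain) convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def delay(seq, d):
    """Delay a sequence by d samples (prepend d zeros)."""
    return [0.0] * d + list(seq)


z = complex(0.9, 0.4)  # arbitrary nonzero test point
x = [1.0, -2.0, 0.5, 3.0]
h = [0.25, 0.5, 0.25, 0.0]

# Linearity: Z{2x + 3h} = 2 X(z) + 3 H(z)
mixed = [2 * a + 3 * b for a, b in zip(x, h)]
linearity_gap = abs(z_transform(mixed, z)
                    - (2 * z_transform(x, z) + 3 * z_transform(h, z)))

# Convolution theorem: Z{x * h} = X(z) H(z)
convolution_gap = abs(z_transform(convolve(x, h), z)
                      - z_transform(x, z) * z_transform(h, z))

# Delay theorem: Z{x[n - d]} = z**(-d) X(z)
delay_gap = abs(z_transform(delay(x, 3), z) - z ** (-3) * z_transform(x, z))

print(linearity_gap, convolution_gap, delay_gap)
```

    The convolution theorem is what makes filter banks easy to model: filtering in time becomes multiplication of transforms.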

    MPEG Time to Frequency Mapping

    Analysis Filter:

    $$h_k[n] = h[n]\,\cos\!\left[\left(k + \tfrac{1}{2}\right)(n - 16)\,\frac{\pi}{32}\right]$$

    Synthesis Filter:

    $$g_k[n] = 32\,h[n]\,\cos\!\left[\left(k + \tfrac{1}{2}\right)(n + 16)\,\frac{\pi}{32}\right]$$

    $$k = 0, 1, \ldots, 31; \qquad n = 0, 1, \ldots, 511$$

    • Uses a filter of 32 bands, signal represented by 512 samples

    • The above equations allow for taking apart the signal (the h part of the time to frequency mapping diagram) and putting it back together (the g part of the time to frequency mapping diagram)
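
    A rough sketch of how the 32 analysis and 32 synthesis filters can be generated by cosine modulation of one prototype h[n]. It assumes the usual MPEG-1 form, with (k + 1/2)(n - 16)π/32 inside the analysis cosine and (k + 1/2)(n + 16)π/32 inside the synthesis cosine. The prototype here is a hypothetical windowed sinc, used only as a placeholder; the real coefficients come from a table in the standard and are not reproduced.

```python
import math

N_BANDS = 32
N_TAPS = 512


def placeholder_prototype():
    """Hypothetical low-pass prototype h[n]: a Hann-windowed sinc with
    cutoff near pi/64. NOT the window table from the MPEG standard."""
    h = []
    for n in range(N_TAPS):
        t = (n - (N_TAPS - 1) / 2) / (2 * N_BANDS)
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / N_TAPS)
        h.append(sinc * hann / (2 * N_BANDS))
    return h


def analysis_filter(h, k):
    """h_k[n] = h[n] cos[(k + 1/2)(n - 16) pi / 32]"""
    return [
        h[n] * math.cos((k + 0.5) * (n - 16) * math.pi / 32)
        for n in range(N_TAPS)
    ]


def synthesis_filter(h, k):
    """g_k[n] = 32 h[n] cos[(k + 1/2)(n + 16) pi / 32]"""
    return [
        32 * h[n] * math.cos((k + 0.5) * (n + 16) * math.pi / 32)
        for n in range(N_TAPS)
    ]


h = placeholder_prototype()
analysis_bank = [analysis_filter(h, k) for k in range(N_BANDS)]
synthesis_bank = [synthesis_filter(h, k) for k in range(N_BANDS)]
print(len(analysis_bank), "analysis filters of", len(analysis_bank[0]), "taps")
```

    Each band k shifts the same low-pass shape to a different center frequency, which is why a single prototype is enough to define the whole bank.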

    PQMF & MDCT

    • Both are methods of time to frequency mapping

    • Pseudo-Quadrature Mirror Filter

    • Modified Discrete Cosine Transform

    • Mathematically, they are equivalent

    • PQMF involves using Z transforms to represent the amplitudes of the frequencies

    • MDCT involves performing a block transform using a window to represent amplitudes

    • These amplitudes are then quantized
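
    A minimal sketch of the windowed block transform, assuming a sine window and 50% overlapping blocks. A block of 2N samples gives N coefficients, and overlap-adding the inverse transforms cancels the aliasing between neighboring blocks. The block size N = 18 matches the long-block MDCT size used per sub-band in Layer III, but the code is a direct O(N^2) illustration, not the standard's procedure.

```python
import math


def sine_window(n_coeffs):
    length = 2 * n_coeffs
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """2N windowed samples -> N coefficients."""
    n_coeffs = len(block) // 2
    return [
        sum(
            x * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for n, x in enumerate(block)
        )
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    """N coefficients -> 2N time-aliased samples (scaled by 2/N)."""
    n_coeffs = len(coeffs)
    return [
        (2.0 / n_coeffs) * sum(
            c * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)
        )
        for n in range(2 * n_coeffs)
    ]


def analyze_synthesize(signal, n_coeffs):
    """Window, transform, inverse transform, window again, overlap-add."""
    window = sine_window(n_coeffs)
    output = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_coeffs + 1, n_coeffs):
        block = [signal[start + n] * window[n] for n in range(2 * n_coeffs)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_coeffs):
            output[start + n] += rebuilt[n] * window[n]
    return output


N = 18
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(6 * N)]
output = analyze_synthesize(signal, N)
# The first and last N samples are covered by only one block, so only
# the interior is expected to be reconstructed.
interior = range(N, len(signal) - N)
max_error = max(abs(signal[n] - output[n]) for n in interior)
print("max interior reconstruction error:", max_error)
```

    In a coder, the quantizer sits between `mdct` and `imdct`, so the reconstruction is no longer exact.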

    Psychoacoustic Model

    • Determines the masking threshold for each sub-band

    • Uses the human auditory property of auditory masking
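
    One way to see how a per-band masking threshold is used: the sketch below turns hypothetical signal and masking levels (in dB) into a bit count per band, using the signal-to-mask ratio and the rule of thumb that each quantizer bit buys about 6 dB of signal-to-noise ratio. The levels and the function name are invented for illustration, and the sketch starts from given thresholds rather than computing them from a spectrum as an actual psychoacoustic model does.

```python
import math

DB_PER_BIT = 6.02  # approximate SNR gain per additional quantizer bit


def bits_for_band(signal_db, mask_db):
    """Bits needed to push quantization noise below the masking threshold."""
    signal_to_mask = signal_db - mask_db
    if signal_to_mask <= 0:
        return 0  # the band is entirely masked: spend no bits on it
    return math.ceil(signal_to_mask / DB_PER_BIT)


# Hypothetical levels for four sub-bands (dB).
signal_levels = [60.0, 48.0, 35.0, 20.0]
mask_levels = [30.0, 40.0, 38.0, 25.0]

allocation = [
    bits_for_band(s, m) for s, m in zip(signal_levels, mask_levels)
]
print(allocation)
```

    Bands whose content lies below the threshold receive no bits at all, which is where much of the saving comes from.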

    Non-uniform Quantizer

    • Analog to digital

    • Quantizer: maps amplitude values into a finite number of bits

    • Non-uniform: changes the quantization step size according to amplitude values

    • Parts of the signal with lesser amplitude are coded with greater accuracy, which increases the signal-to-noise ratio (SNR)
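
    A sketch of a non-uniform midtread quantizer, assuming a power law with exponent 3/4 (the companding exponent used in Layer III). The `step` parameter and the function names are illustrative, and the small rounding offset and scale factors of the actual standard are left out.

```python
import math


def quantize(x, step):
    """Map an amplitude to an integer index; zero maps to zero (midtread)."""
    magnitude = math.floor(abs(x) ** 0.75 / step + 0.5)
    return int(math.copysign(magnitude, x))


def dequantize(index, step):
    """Map an integer index back to a reconstruction amplitude."""
    return math.copysign((abs(index) * step) ** (4.0 / 3.0), index)


step = 1.0
levels = [dequantize(q, step) for q in range(6)]
gaps = [b - a for a, b in zip(levels, levels[1:])]
print("reconstruction levels:", [round(v, 2) for v in levels])
print("gaps between levels:  ", [round(g, 2) for g in gaps])
```

    The gaps between reconstruction levels grow with amplitude, so small values are represented more finely than large ones.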

    Huffman coding

    • For better data compression, variable-length Huffman codes are used to encode the quantized samples.

    • Quantized MDCT coefficients (for long blocks) are arranged in order from lowest to highest frequency

    • The whole range is divided into 3 sections, each coded with a different set of Huffman tables
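
    To show why variable-length codes help, the sketch below builds a Huffman code for a small made-up run of quantized values in which zero dominates. One assumption to flag: it derives the code table from the data itself, whereas MP3 selects among fixed tables defined in the standard.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return {symbol: bit string} for the given symbol sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: code so far}).
    heap = [
        (count, index, {symbol: ""})
        for index, (symbol, count) in enumerate(sorted(counts.items()))
    ]
    heapq.heapify(heap)
    next_index = len(heap)
    while len(heap) > 1:
        count0, _, table0 = heapq.heappop(heap)
        count1, _, table1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in table0.items()}
        merged.update({s: "1" + c for s, c in table1.items()})
        heapq.heappush(heap, (count0 + count1, next_index, merged))
        next_index += 1
    return heap[0][2]


def encode(symbols, table):
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    inverse = {code: symbol for symbol, code in table.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    return symbols


# Hypothetical quantized samples: small values are the most common.
quantized = [0] * 8 + [1] * 4 + [-1] * 2 + [2] + [-2]
table = huffman_code(quantized)
bits = encode(quantized, table)
print(len(bits), "bits with Huffman codes vs", 3 * len(quantized), "fixed-length")
```

    Frequent values get short codes and rare values get long ones, so the total is smaller than with a fixed 3 bits per value.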

    Bitstream Formatting

    • Formats the encoded quantized samples into an encoded bitstream: the final form in which the compressed signal is transmitted.
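
    At its simplest, this last step means packing variable-length codewords into whole bytes. The sketch below shows only that step, with illustrative function names; a real MP3 frame also carries a header and side information, which are not modeled here.

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, zero-padding the
    final byte. Returns the bytes and the number of meaningful bits."""
    padded = bits + "0" * (-len(bits) % 8)
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, len(bits)


def unpack_bits(data, n_bits):
    """Recover the original bit string, dropping the padding."""
    bits = "".join(format(byte, "08b") for byte in data)
    return bits[:n_bits]


codewords = "1011001110"  # hypothetical concatenated codewords
data, n_bits = pack_bits(codewords)
print(data.hex(), n_bits)
```

    The bit count has to travel with the data (or be implied by the format) so that the decoder can tell padding from payload.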

    MPEG-4 and The Future?

    • Incorporates speech and music compression

    • More of an extension of MPEG-2 compression techniques, with independent techniques geared specifically at coding for speech content (some coding for meaning)

    • Hasn't really taken off yet; only time will tell

    • MPEG-2 AAC (Advanced Audio Coding) is the audio format that is used if you download from the Apple iTunes Store