W.A.V.S. Compression Alex Chen Nader Shehad Aamir Virani Erik Welsh

Post on 26-Mar-2015

W.A.V.S. Compression

Alex Chen, Nader Shehad, Aamir Virani, Erik Welsh

Overview
- Approach
- Psychoacoustic Modeling
- Filter Banks
- Quantization
- Demonstration
- Results
- Further Research

Approach

Encoding: Input -> Filter Banks -> Psychoacoustic Model -> Quantization -> Encoded Signal

Decoding: Encoded Signal -> Inverse Quantization -> Reconstruction Filter Banks -> Output

Psychoacoustic Model
- Based on studies showing that hearing capabilities are affected by:
    - Environment
    - Limitations of the human auditory system
- Used to eliminate portions of the signal that the average human won’t hear
- Two key properties:
    - Absolute threshold of hearing
    - Auditory masking

Absolute Threshold of Hearing
- Experiment:
    - Plot the audible threshold of a tone as a function of frequency
- Observations:
    - Auditory system is more sensitive to some frequencies than to others
    - Frequencies within a “critical bandwidth” are treated similarly
    - Basis for the Bark scale
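
The slides do not give formulas for the threshold curve or the Bark scale. As a rough sketch only, the code below uses two widely quoted closed-form approximations: Terhardt's fit for the threshold in quiet and Zwicker's arctangent mapping from Hz to Bark. Both are assumptions here and are not necessarily what the authors used; the function names are illustrative.

```python
import math

def absolute_threshold_db(f_hz):
    """Approximate threshold in quiet (dB SPL) for a pure tone at f_hz.

    Terhardt's closed-form fit; an assumption, not taken from the slides.
    """
    k = f_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

def hz_to_bark(f_hz):
    """Map a frequency in Hz to the Bark scale (Zwicker's approximation)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

if __name__ == "__main__":
    # The threshold dips around 3-4 kHz, where hearing is most sensitive.
    for f in (100, 1000, 3300, 10000):
        print(f, round(absolute_threshold_db(f), 1), round(hz_to_bark(f), 1))
```

Under these approximations, a spectral component whose level falls below `absolute_threshold_db` at its frequency is a candidate for removal, and components that map to nearby Bark values can be grouped into the same critical band.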

Auditory Masking
- Tones and noise drown out less powerful sounds
    - Affect neighboring frequencies
    - Affect critical bandwidth
- Effects add to produce overall masking threshold
- Mask quantization
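
The slides do not say how the individual masking effects are added. One common choice in perceptual coders is to add the contributions as powers rather than as decibel values; the sketch below assumes that rule. The function names and the example levels are hypothetical.

```python
import math

def combine_thresholds_db(levels_db):
    """Add masking contributions (in dB) as powers; return the total in dB.

    Power-domain addition is an assumed rule, not one stated on the slide.
    """
    total_power = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_power)

def is_masked(component_db, threshold_db):
    """A component below the overall threshold is treated as inaudible."""
    return component_db < threshold_db

if __name__ == "__main__":
    # Hypothetical contributions from a tone, a noise band, and the
    # threshold in quiet at one frequency.
    overall = combine_thresholds_db([40.0, 35.0, 5.0])
    print(round(overall, 2), is_masked(30.0, overall))
```

With this rule, two equal contributions raise the overall threshold by about 3 dB, and a much weaker contribution barely changes it.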

Filter Banks
- Theory
    - Array of bandpass filters
    - Break up signal into frequency subbands
    - Allows for variable coding scheme

Analysis and Synthesis Banks

1) Analysis filters divide up the signal
2) Down-sample
3) Quantize
4) Up-sample
5) Synthesis filters remove distortions
6) Reconstruct the signal
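
The six steps above can be sketched with the smallest possible filter bank: a two-channel Haar bank. This is a stand-in chosen for brevity, not the 32-band bank used in the project. The code works on sample pairs, which is equivalent to filtering with the two-tap Haar filters and keeping every other output. The step size and test signal are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar filter coefficient

def analyze(x):
    """Steps 1-2: Haar low/high analysis filters, down-sampled by 2.

    Assumes len(x) is even.
    """
    low = [S * (x[i] + x[i + 1]) for i in range(0, len(x) - 1, 2)]
    high = [S * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]
    return low, high

def quantize(band, step):
    """Step 3: round each subband sample to the nearest multiple of step."""
    return [step * math.floor(v / step + 0.5) for v in band]

def synthesize(low, high):
    """Steps 4-6: up-sample, apply the synthesis filters, sum the bands."""
    y = []
    for l, h in zip(low, high):
        y.append(S * (l + h))  # even-indexed output sample
        y.append(S * (l - h))  # odd-indexed output sample
    return y

signal = [0.0, 0.3, 0.5, 0.2, -0.1, -0.4, -0.2, 0.1]
low, high = analyze(signal)
exact = synthesize(low, high)                                # no quantizer
lossy = synthesize(quantize(low, 0.1), quantize(high, 0.1))  # with quantizer

if __name__ == "__main__":
    print(max(abs(a - b) for a, b in zip(signal, exact)))
    print(max(abs(a - b) for a, b in zip(signal, lossy)))
```

Without the quantizer the output matches the input up to floating-point rounding, so any remaining error in `lossy` comes from step 3 alone.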

Filter Bank Design Phase
- Tradeoff between fine and coarse frequency resolution
    - Piccolo vs. Castanets
    - Non-stationary signals
- We used non-adaptive approach

Filter Bank Implementation
- We used Cosine Modulated PR (perfect reconstruction) filter banks with 32 filters each
- Output is a delayed version of the input (linear phase)
- Distortion arises from quantization only
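
The slides do not give the prototype filter or its length. As an illustration of the idea only, the sketch below uses the MDCT with a sine window, which is one well-known cosine-modulated perfect-reconstruction bank, configured for 32 subbands. The window choice, the padding scheme, and all names are assumptions and not the authors' design.

```python
import math

N = 32  # number of subbands, as on the slide

# Sine window: satisfies w[n]^2 + w[n + N]^2 = 1 (Princen-Bradley condition).
WINDOW = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]
# Cosine-modulation kernel shared by the analysis and synthesis banks.
BASIS = [[math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
          for k in range(N)] for n in range(2 * N)]

def analyze(x):
    """Return a list of frames, each holding N subband samples."""
    padded = [0.0] * N + list(x) + [0.0] * N
    padded += [0.0] * (-len(padded) % N)  # round length up to a multiple of N
    frames = []
    for start in range(0, len(padded) - N, N):  # hop of N samples
        block = padded[start:start + 2 * N]     # 50% overlapping blocks
        frames.append([sum(WINDOW[n] * block[n] * BASIS[n][k]
                           for n in range(2 * N)) for k in range(N)])
    return frames

def synthesize(frames, length):
    """Overlap-add the windowed inverse transforms; aliasing terms cancel."""
    out = [0.0] * ((len(frames) + 1) * N)
    for i, coeffs in enumerate(frames):
        for n in range(2 * N):
            y = sum(coeffs[k] * BASIS[n][k] for k in range(N))
            out[i * N + n] += (2.0 / N) * WINDOW[n] * y
    return out[N:N + length]  # drop the N-sample padding added in analyze

if __name__ == "__main__":
    x = [math.sin(0.1 * n) + 0.5 * math.cos(0.37 * n) for n in range(100)]
    y = synthesize(analyze(x), len(x))
    print(max(abs(a - b) for a, b in zip(x, y)))
```

Each hop of 32 input samples yields 32 subband samples, so the bank is critically sampled. In a streaming implementation the padding would show up as a fixed delay between input and output; this sketch trims it so the two can be compared sample by sample. Quantizing the values in `frames` before calling `synthesize` would be the only source of distortion.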

Quantization
- Two types:
    - Narrow-band
        - Depends on current input
        - Overhead cost
    - Full-range
        - Independent of current input
        - No overhead

[Figure: Sampled Input -> Quantized Version -> Reconstructed Input]

Quantization
- Narrow-band
    - More accurate
    - Lower compression ratio
- Full-range
    - Less accurate
    - Higher compression ratio

Using 3-bit Quantization

Narrow-band:
Input:        -.4   -.22   .14   .4
Levels:        1     3      6     8
Recon.:       -.4   -.2    .1    .3
Total Error:  .16

Full-range:
Input:        -.4   -.22   .14   .4
Output:        3     4      6     7
Recon.:       -.5   -.25   .25   .50
Total Error:  .34
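
The two 3-bit examples above can be reproduced with a uniform quantizer whose grid is either fixed or fitted to the input. The grid choices below are inferred from the slide's numbers and are not stated by the authors: the full-range grid starts at -1 with step 0.25, and the narrow-band grid starts at the smallest input with step (max - min) / 8. Function names are illustrative.

```python
import math

def quantize(samples, lo, step, bits=3):
    """Map each sample to the nearest grid index in 0 .. 2**bits - 1."""
    top = 2 ** bits - 1
    indices = []
    for x in samples:
        i = int(math.floor((x - lo) / step + 0.5))  # round to nearest level
        indices.append(min(max(i, 0), top))         # clip to the grid
    return indices

def reconstruct(indices, lo, step):
    """Turn level indices back into sample values."""
    return [lo + i * step for i in indices]

def total_error(samples, recon):
    """Sum of absolute reconstruction errors, as on the slide."""
    return sum(abs(x - r) for x, r in zip(samples, recon))

samples = [-0.4, -0.22, 0.14, 0.4]
bits = 3

# Full-range: fixed grid over [-1, 1), independent of the input.
full_lo, full_step = -1.0, 2.0 / 2 ** bits
full_idx = quantize(samples, full_lo, full_step, bits)
full_rec = reconstruct(full_idx, full_lo, full_step)

# Narrow-band: grid fitted to this input (inferred rule, see lead-in).
narrow_lo = min(samples)
narrow_step = (max(samples) - min(samples)) / 2 ** bits
narrow_idx = quantize(samples, narrow_lo, narrow_step, bits)
narrow_rec = reconstruct(narrow_idx, narrow_lo, narrow_step)

if __name__ == "__main__":
    # The slide numbers levels from 1, so add 1 to each index when printing.
    print([i + 1 for i in narrow_idx], total_error(samples, narrow_rec))
    print([i + 1 for i in full_idx], total_error(samples, full_rec))
```

A decoder for the narrow-band case needs `narrow_lo` and `narrow_step` in addition to the indices, which is presumably the overhead cost mentioned on the previous slide. The full-range grid is known in advance, so only the indices need to be stored.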

Demonstration
- Sine wave: Full range, Narrow range
- Chime: 8-bit, Full range, Narrow range
- Percussion: Full range, Narrow range
- Modern: 8-bit, Full range, Narrow range

[Figure: Sine Wave (time). Panels: Full-Range Quantization, Narrow Quantization]

[Figure: Sine Wave (freq). Panels: Full-Range Quantization, Narrow Quantization]

[Figure: Sine Wave (freq error). Panels: Full-Range Quantization, Narrow Quantization]

[Figure: Modern (time). Panels: Full-Range Quantization, Narrow Quantization]

[Figure: Modern (freq). Panels: Full-Range Quantization, Narrow Quantization]

[Figure: Modern (freq error). Panels: Full-Range Quantization, Narrow Quantization]

Results
- Full Range: Smallest File, Worst Sound Quality
- Narrow Range: Better Sound Quality, Larger File
- MP3: Industry Standard

All sizes in bytes.

Data            Full-Range  Narrow  8bit    16bit  Original (16*numsamples)  Original wav  Mp3
Pure sine       14854       18168   14000   27600  24000                     24044         50152
separate sines  14374       17656   14000   27600  24000                     24044         50152
near sines      14830       18142   14000   27600  24000                     24044         5015
percussion      261726      311542  237584  X      460000                    460044        84427
chimes          72330       89316   70000   X      120002                    120046        37198
Modern          252692      301260  246576  X      449232                    449276        82755
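
To make the table easier to compare across clips, the sketch below expresses the Full-Range and Narrow sizes as fractions of the Original wav size. The byte counts are copied from the table; using the Original wav column as the reference is a choice made here, not one stated on the slide.

```python
# (full_range_bytes, narrow_bytes, original_wav_bytes), copied from the table
sizes = {
    "Pure sine":      (14854, 18168, 24044),
    "separate sines": (14374, 17656, 24044),
    "near sines":     (14830, 18142, 24044),
    "percussion":     (261726, 311542, 460044),
    "chimes":         (72330, 89316, 120046),
    "Modern":         (252692, 301260, 449276),
}

def size_fractions(full_range, narrow, original):
    """Return each compressed size as a fraction of the original wav size."""
    return full_range / original, narrow / original

fractions = {name: size_fractions(*row) for name, row in sizes.items()}

if __name__ == "__main__":
    for name, (full, narrow) in fractions.items():
        print(f"{name:15s} full-range {full:.3f}  narrow {narrow:.3f}")
```

In every row the Full-Range fraction is smaller than the Narrow fraction, which matches the first two bullets of the slide.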

Further Research
- Filter Banks
    - Wavelets
    - Dynamic Frequency Ranges
- Better Psychoacoustic Model
    - Tone Designation
    - Pre- and Post-Echo
- Bit Allocation
- Writing a File