Presentation On Audio Compression & Psychoacoustic
Submitted By: Santosh Kumar Yadav (111432), M.E. Modular (2011)
Under the Supervision of: Mrs. Shano Solanki, Assistant Professor, C.S.E, NITTTR, Chandigarh

Post on 04-Jan-2016


Content

• History of Audio Compression
• Basics of Audio Compression
• Categorization of Audio Compression
• Silence Compression
• ADPCM
• LPC
• CELP
• Psychoacoustics
• Frequency Masking
• Critical Band
• Temporal Masking

History of Audio Compression

• The first form of audio compression appeared in 1939, when Dudley introduced the vocoder to reduce the bandwidth needed to transmit speech over a telephone line.
• In the 1960s, compression was used in telephony.
• Nowadays, various compression techniques are used in storage devices and with various file formats.

Basics of Audio Compression

• Compression can be accomplished in two ways:
a. Take the data from a standard digital audio system and compress it in software.
b. Encode the signal in a different system and compress it in hardware.
• The sounds we hear are caused by variations in air pressure, which are picked up by our ears.
• In an analog electronic audio system, these pressure signals are converted to an electric voltage by a microphone.

Voice Pattern

Quantization of Voice Signal

Signal Reconstruction

Categorization of Audio Compression

Audio Compression

• Simple Audio Compression:
1. Silence Compression using RLE
2. Adaptive Differential PCM
3. Linear Predictive Coding
4. Code Excited Linear Prediction

• Psychoacoustics:
1. Frequency Masking
2. Temporal Masking

• MPEG Audio Compression

Silence Compression using RLE

• Run-length encoding itself is lossless, but treating low-level samples as silence discards information, so the overall scheme is lossy.
• It is easy to implement.
• Each run of silence is replaced by a code and the number of consecutive silent samples.
• Steps:
a. Determine a threshold for the audio data.
b. If the audio level is below the threshold, it is considered silence.
c. Silence in the audio is replaced by a code (e.g. "0") followed by the run length.
• The higher the threshold, the greater the compression and hence the greater the loss of information.
• Silence encoding is important for human speech, which has flat pauses between spoken words.
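
The steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the `("silence", run)` / `("sample", value)` token format and the choice to decode silence as zeros are assumptions made for the example, not part of the slides.

```python
def rle_silence_encode(samples, threshold):
    """Replace runs of below-threshold samples with ("silence", run_length)."""
    encoded = []
    run = 0  # length of the current run of silent samples
    for s in samples:
        if abs(s) < threshold:
            run += 1  # still inside a silent stretch
        else:
            if run:
                encoded.append(("silence", run))
                run = 0
            encoded.append(("sample", s))
    if run:  # flush a trailing silent run
        encoded.append(("silence", run))
    return encoded


def rle_silence_decode(encoded):
    """Rebuild the signal; silent runs come back as zeros (the lossy step)."""
    out = []
    for kind, value in encoded:
        if kind == "silence":
            out.extend([0] * value)
        else:
            out.append(value)
    return out
```

Decoding returns zeros for every silent run, so quiet samples such as 1 or -1 are not recovered; raising `threshold` lengthens the runs and increases that loss.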

Adaptive Differential PCM

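• Used for quantization of the audio signal.
• The scaled difference signal $f_n$ is defined as:

$$e_n = s_n - \hat{s}_n, \qquad f_n = \frac{e_n}{\alpha}$$

• $e_n$ is the difference between the actual sample $s_n$ and its predicted value $\hat{s}_n$; $\alpha$ is a multiplier constant.
• $f_n$ is fed into the quantizer for quantization.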
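
A minimal sketch of the idea, assuming the scaled difference is the prediction error divided by α, the predictor is simply the previous reconstructed sample, and α is adapted by a toy doubling/halving rule. The adaptation rule, the 4-bit code width and all function names are assumptions for illustration; this is not a standardized ADPCM codec.

```python
def _adapt(alpha, code, qmax):
    """Toy step-size rule (assumption): grow on overload, shrink when idle."""
    if abs(code) >= qmax:
        return min(alpha * 2.0, 1024.0)
    if abs(code) <= 1:
        return max(alpha / 2.0, 1.0)
    return alpha


def adpcm_encode(samples, alpha=4.0, bits=4):
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    pred = 0.0  # predicted value: previous reconstructed sample
    codes = []
    for s in samples:
        e = s - pred                          # difference signal e_n
        f = e / alpha                         # scaled difference f_n
        q = max(qmin, min(qmax, round(f)))    # quantize to a small integer code
        codes.append(q)
        pred += q * alpha                     # track what the decoder will rebuild
        alpha = _adapt(alpha, q, qmax)
    return codes


def adpcm_decode(codes, alpha=4.0, bits=4):
    qmax = 2 ** (bits - 1) - 1
    pred = 0.0
    out = []
    for q in codes:
        pred += q * alpha                     # undo the scaling, add to prediction
        out.append(pred)
        alpha = _adapt(alpha, q, qmax)        # same rule keeps both sides in sync
    return out
```

The encoder updates its prediction from the quantized code rather than from the true sample, so encoder and decoder stay in step without transmitting α.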

Vocoders

• Vocoder stands for voice coder.
• Used in Linear Predictive Coding.
• Separates the signal into frequency ranges using sub-band filters.
• Sound is classified as:
a. Voiced (pulse-like, e.g. vowels)
b. Unvoiced (noise-like, e.g. consonants)
• Consonants like M and N can be treated as voiced because they use the vocal cords.

Working of Vocoder

• The pitch period of the voice is measured.
• The voiced/unvoiced bit is set for voiced sound and reset for unvoiced sound.
• The frequency content of the sound is separated by a bank of filters.
• The signal is transmitted to the receiver and decoded there.

Linear Predictive Coding

• LPC vocoders extract salient features of speech directly from the waveform rather than transforming the signal to the frequency domain.
• The bit rate is small because the sound itself is not sent; only its analyzed attributes are sent.
• The attributes are descriptive parameters (such as gain and maximum and minimum amplitude).

Sound Signal → Segments → Samples (Speech Frames)

Linear Predictive Coding

• LPC decides whether the current segment is voiced or unvoiced.
• For unvoiced sound: a noise generator is used to create the sample values f(n).
• For voiced sound: a pulse train generator is used to create the sample values f(n).

$$s(n) = \sum_{i=1}^{p} a_i\, s(n-i) + G\, f(n)$$

• s(n) is the current output, s(n-i) are the previous outputs, a_i are the predictor coefficients, p is the predictor order, G is the gain factor, and f(n) is the current frame input.
• It is called linear because each output is a linear combination of the previous outputs and the current input.
• The speech encoder works in a block-wise fashion.
• Advantage: simple and easy to implement.
• Disadvantage: the generated output has a larger error.
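
The decoder side of this scheme can be sketched as follows, assuming the usual LPC synthesis form in which each output is a weighted sum of previous outputs plus the gain times the excitation f(n). The coefficient list `coeffs`, the helper names and the fixed random seed are assumptions for the example; real coefficients would come from analysing each speech frame.

```python
import random


def pulse_train(length, period):
    """Voiced excitation: one unit pulse every `period` samples."""
    return [1.0 if i % period == 0 else 0.0 for i in range(length)]


def noise(length, seed=0):
    """Unvoiced excitation: uniform noise in [-1, 1] (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def lpc_synthesize(coeffs, gain, excitation):
    """s(n) = sum_i coeffs[i-1] * s(n-i) + gain * f(n), with s(n) = 0 for n < 0."""
    out = []
    for n, f_n in enumerate(excitation):
        acc = gain * f_n
        for i, a_i in enumerate(coeffs, start=1):
            if n - i >= 0:
                acc += a_i * out[n - i]  # contribution of a previous output
        out.append(acc)
    return out
```

A voiced frame would be rebuilt with `lpc_synthesize(coeffs, gain, pulse_train(frame_len, pitch_period))` and an unvoiced one with `lpc_synthesize(coeffs, gain, noise(frame_len))`, which is why only the coefficients, gain, pitch period and voiced/unvoiced flag need to be transmitted.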

Code Excited Linear Prediction

• It is more complex than LPC.
• A codebook of excitation vectors is kept; the actual speech is matched against it and the index of the best match is sent to the receiver.
• This complexity increases the bit rate to 4800-9600 bps.
• CELP coders use two kinds of prediction:
a. STP (short-time prediction): predicts the next sample from a few previous samples, removing redundancy within a speech frame.
b. LTP (long-time prediction): exploits the pitch periodicity of speech, removing redundancy across the segment.
• Advantage: it reproduces the original sound closely.
• Disadvantage: it is complex and requires more bandwidth.
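
The codebook lookup can be illustrated with a deliberately simplified search. The sketch below picks the codebook entry with the smallest squared error against a target vector; a real CELP coder would first pass each candidate through the synthesis filter and optimise a gain, which is omitted here. The function name and the tiny codebook are assumptions for the example.

```python
def best_codebook_index(target, codebook):
    """Return the index of the codebook vector closest to `target`."""
    def squared_error(vec):
        return sum((t - v) ** 2 for t, v in zip(target, vec))

    # Only this index needs to be sent; the receiver holds the same codebook.
    return min(range(len(codebook)), key=lambda i: squared_error(codebook[i]))


codebook = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, -1.0],
    [1.0, 1.0, 1.0],
]
index = best_codebook_index([0.9, 0.1, -0.8], codebook)
```

Sending `index` instead of the vector itself is where the bit-rate saving comes from; the cost is the exhaustive search over the codebook at the encoder.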

Psychoacoustics

• Psychoacoustic modeling is also referred to as perceptual coding.
• Range of human hearing: 20 Hz to 20 kHz.
• Most audible range: about 500 Hz to 4 kHz.
• The dynamic range of hearing, the ratio of the loudest sound to the quietest sound a human can hear, is about 120 dB.
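
To give the 120 dB figure some scale, the sketch below converts between decibels and a linear amplitude ratio, assuming the 20·log10 convention used for amplitudes (sound pressure). The function names are illustrative.

```python
import math


def db_to_amplitude_ratio(db):
    """Linear amplitude ratio for a level difference given in decibels."""
    return 10 ** (db / 20)


def amplitude_ratio_to_db(ratio):
    """Level difference in decibels for a linear amplitude ratio."""
    return 20 * math.log10(ratio)
```

Under that convention, 120 dB corresponds to an amplitude ratio of about one million to one.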

Equal Loudness Relation

Frequency Masking

• Threshold of Hearing:
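
The threshold-of-hearing curve on this slide was a figure and does not survive in the transcript. As a stand-in, the sketch below evaluates a widely used closed-form approximation of the threshold in quiet (attributed to Terhardt); it is not necessarily the curve the slide showed, and the function name is an assumption.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in dB for a frequency in Hz (> 0)."""
    k = freq_hz / 1000.0  # frequency in kHz
    return (
        3.64 * k ** -0.8                          # steep rise at low frequencies
        - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)   # dip of best sensitivity near 3.3 kHz
        + 1e-3 * k ** 4                           # rise at high frequencies
    )
```

The curve is lowest around 3 to 4 kHz and rises towards both ends of the audible range; any tone whose level falls below it can be discarded without being missed.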

Frequency Masking Curve

• The greater the power in the masking tone, the wider its influence: the broader the range of frequencies it can mask.
• If two tones are widely separated in frequency, little masking occurs.

Multiple Frequency Tone Masking

Critical Band

• The critical band represents the ear's resolving power for simultaneous tones or partials.

Bark Unit

• The Bark is the unit of critical-band rate, named after Heinrich Barkhausen.
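
One common way to map a frequency in Hz onto the Bark scale is the arctangent approximation attributed to Zwicker and Terhardt, sketched below. The slide itself gives no formula, so the choice of this particular approximation and the function name are assumptions.

```python
import math


def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark for a frequency in Hz."""
    return (
        13.0 * math.atan(0.00076 * freq_hz)
        + 3.5 * math.atan((freq_hz / 7500.0) ** 2)
    )
```

With this approximation the audible range up to 20 kHz spans a little under 25 Bark, and each Bark corresponds to roughly one critical band.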

Temporal Masking

• The louder the test tone, the shorter the amount of time required before the test tone is audible once the masking tone is removed.

Summary

• Basics of audio compression.
• Types of audio compression.
• Fundamentals of psychoacoustics.

FAQs

• Why is linear predictive coding called linear?
• What is the significance of the equal-loudness curve?
• How can RLE be applied to audio?
• What is the role of the noise generator and the pulse generator in a vocoder?

Queries?

Thank You For Your Patience