3. Audio Technology - Copyright © Denis Hamelin - Ryerson University Audio Technology

Post on 21-Dec-2015

Audio Technology

What is sound?

Sound is a physical phenomenon caused by the vibration of a material (e.g., a violin string).

The vibration triggers pressure wave fluctuations in the air around the material.

The pressure waves propagate in the air.

We hear the sound when the wave reaches our eardrums.

What is sound?

The waveform occurs repeatedly at regular intervals or periods.

Sound waves have a natural origin, so they are never absolutely uniform or periodic.

A sound with a recognizable periodicity is called music. It includes singing.

Non-periodic sounds can be called noises.

Sound Waves

Without a medium such as air there is no sound. For example, sound does not travel through the vacuum of space.

Sound exhibits wave behaviour such as reflection, refraction and diffraction. This makes the design of “surround sound” possible.

Pressure Wave Oscillation

[Figure: air pressure plotted against time, with the period and the amplitude of the oscillation marked.]

Frequency

The frequency of a sound is the reciprocal of its period.

The frequency represents the number of periods per second and is measured in hertz (Hz).

One kilohertz (kHz) is 1000 oscillations per second, or 1000 Hz.

Frequency Ranges

Infrasonic: 0 to 20 Hz

Audiosonic: 20 Hz to 20 kHz

Ultrasonic: 20 kHz to 1 GHz

Hypersonic: 1 GHz to 10 THz

In multimedia we are concerned with sounds in the audiosonic range.

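Wave Length

The wavelength is the distance covered by one period of the wave. It is inversely proportional to the frequency: wavelength = speed of sound / frequency, where the speed of sound in air is about 343 m/s.

A sound with a frequency of 20 Hz has a wavelength of about 17 meters.

A sound with a frequency of 20 kHz has a wavelength of about 1.7 centimeters.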
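The relation between frequency and wavelength can be checked with a few lines of Python. This is a minimal sketch; the function name and the speed of sound of 343 m/s (dry air at roughly 20 °C) are assumptions made for illustration and do not come from the slides.

```python
# Assumed value: speed of sound in dry air at roughly 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_m(frequency_hz: float, speed: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Return the wavelength in meters: propagation speed divided by frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed / frequency_hz


if __name__ == "__main__":
    # The two ends of the audible range.
    for f in (20.0, 20_000.0):
        print(f"{f:>8.0f} Hz -> {wavelength_m(f):.4f} m")
```

With a different assumed speed of sound (it varies with temperature) the results shift slightly, which is why the figures on the slide are approximate.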

Amplitude

A sound has a property called amplitude, which humans perceive subjectively as loudness or volume. It is measured in decibels (dB).

The amplitude of a sound is a measure of the deviation of the pressure wave from its mean value.

0 dB - threshold of hearing (no audible sound)

20 dB - rustling of paper

35 dB - quiet home

70 dB - noisy street

100 dB - iPod at full volume

110 dB - front row at rock concert

130 dB - pain threshold

160 dB - instant perforation of eardrum

Sound Perception

Sound enters the ear canal.

At the eardrum, sound energy (air pressure changes) is transformed into mechanical energy (the eardrum vibrates).

The outer ear helps us to locate the source of the sound by the relative intensity differences between the two ears.

The inner ear transforms the sound into impulses sent to the brain.

Frequency Perception

Human hearing is not equally sensitive at all frequencies.

It is easier to perceive midrange frequencies than very high and very low frequencies.

Sometimes a loud sound will mask a softer one, especially if the frequencies of the two sounds are in a similar range.

This will be important for sound compression.

Frequency Perception

For some frequencies, a sound can be physically softer and still be perceived as louder.

Audio Representation on Computers

The computer has to measure the wave's amplitude at regular time intervals.

It then generates a series of sampling values (samples).

This process is called digitization and is performed by an analog-to-digital converter (ADC).

A digital-to-analog converter (DAC) is used to achieve the opposite conversion.

Sampling Rate

The rate of sampling an analog signal is measured in Hz (number of samples per second).

The inverse of the sampling frequency is the sampling period or sampling interval, which is the time between samples.

For CD quality, we use 44100Hz. For DVDs it is 48000Hz.
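To make the relation between sampling rate and sampling interval concrete, the sketch below samples a sine tone at a given rate. The function names are hypothetical, and a pure sine tone is assumed only because it is the simplest test signal.

```python
import math


def sampling_interval_s(sampling_rate_hz: float) -> float:
    """Time between two consecutive samples: the inverse of the sampling rate."""
    return 1.0 / sampling_rate_hz


def sample_sine(frequency_hz: float, sampling_rate_hz: float,
                n_samples: int, amplitude: float = 1.0) -> list[float]:
    """Measure a sine tone at regular intervals, as an idealized ADC would."""
    dt = sampling_interval_s(sampling_rate_hz)
    return [amplitude * math.sin(2.0 * math.pi * frequency_hz * n * dt)
            for n in range(n_samples)]


if __name__ == "__main__":
    print(f"CD sampling interval:  {sampling_interval_s(44_100) * 1e6:.2f} microseconds")
    print(f"DVD sampling interval: {sampling_interval_s(48_000) * 1e6:.2f} microseconds")
    # One full period of a 1000 Hz tone sampled at 8000 Hz is 8 samples.
    print([round(s, 3) for s in sample_sine(1000.0, 8000.0, 8)])
```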

Sampling Rate

[Figure: a waveform with its samples marked; the height of each sample is the measured value.]

Nyquist Theorem

The Nyquist Theorem, also known as the sampling theorem, is a principle that is followed in the digitization of analog signals.

For analog-to-digital conversion (ADC) to result in a faithful reproduction of the signal, the samples of the analog waveform must be taken frequently. The number of samples per second is called the sampling rate or sampling frequency.

Nyquist Theorem

Any analog signal consists of components at various frequencies. The simplest case is the sine wave, in which all the signal energy is concentrated at one frequency.

In practice, analog signals usually have complex waveforms, with components at many frequencies. The highest frequency component in an analog signal determines the bandwidth of that signal. The higher the frequency, the greater the bandwidth, if all other factors are held constant.

Nyquist Theorem

If a signal f(t) is sampled at regular intervals of time and at a rate higher than twice the highest significant signal frequency, then the samples contain all the information of the original signal.

Audible sound occupies a bandwidth of roughly 20 Hz to 20 kHz, so sampling at twice the maximum frequency (40 kHz) would be enough for good audio quality. CD audio slightly exceeds this: its 44100 Hz sampling rate can represent a bandwidth of up to about 22050 Hz.
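The sketch below illustrates what the theorem implies in practice: a pure tone below half the sampling rate keeps its frequency, while a tone above it shows up at a lower, "folded" frequency after sampling (aliasing). The function names are hypothetical, and the folding rule applies to an ideal pure tone with no anti-aliasing filter, which is an assumption of this example.

```python
def nyquist_frequency_hz(sampling_rate_hz: float) -> float:
    """Highest frequency that a given sampling rate can represent."""
    return sampling_rate_hz / 2.0


def apparent_frequency_hz(signal_hz: float, sampling_rate_hz: float) -> float:
    """Frequency that a pure tone appears to have once it has been sampled.

    Frequencies fold back around multiples of the sampling rate, so the
    result always lies between 0 and the Nyquist frequency.
    """
    folded = signal_hz % sampling_rate_hz
    return min(folded, sampling_rate_hz - folded)


if __name__ == "__main__":
    rate = 44_100.0
    print("Nyquist frequency:", nyquist_frequency_hz(rate))
    for tone in (1_000.0, 20_000.0, 30_000.0):
        print(f"{tone:>8.0f} Hz tone appears as {apparent_frequency_hz(tone, rate):.0f} Hz")
```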

Quantization

After sampling, sound signals are represented by one of a fixed number of values, in a process known as pulse-code modulation (PCM).

Pulse-code modulation (PCM) is a digital representation of an analog signal where the magnitude of the signal is sampled regularly at uniform intervals, then quantized to a series of symbols in a numeric (usually binary) code.

Quantization

Quantization depends on the number of bits used to measure the height of the waveform.

16-bit (CD quality) quantization gives 65,536 (64K) possible values. 8-bit quantization gives only 256 (telephone quality).

Example with 8 levels of quantization (3 bits):
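As a rough illustration of 3-bit quantization, the sketch below maps samples in the range -1.0 to 1.0 onto 8 integer levels. The function names, the input range, and the choice of equal-width bins are assumptions made for this example; real converters may round differently.

```python
def quantize(sample: float, bits: int) -> int:
    """Map a sample in [-1.0, 1.0] to an integer level in [0, 2**bits - 1]."""
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, sample))
    # Split [-1, 1] into `levels` equal-width bins and return the bin index.
    index = int((clipped + 1.0) / 2.0 * levels)
    return min(index, levels - 1)  # a sample of exactly +1.0 goes in the top bin


def dequantize(level: int, bits: int) -> float:
    """Return the midpoint of the bin that a quantized level stands for."""
    levels = 2 ** bits
    return (level + 0.5) / levels * 2.0 - 1.0


if __name__ == "__main__":
    for s in (-1.0, -0.5, 0.0, 0.3, 0.99):
        q = quantize(s, bits=3)
        print(f"sample {s:+.2f} -> level {q} (binary {q:03b}) -> {dequantize(q, 3):+.3f}")
```

The difference between a sample and its reconstructed value is the quantization error; it shrinks as more bits are used.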

Noiseless Channels

Nyquist proved that if an arbitrary signal has been run through a low-pass filter of bandwidth H, the filtered signal can be completely reconstructed by taking only 2H (exact) samples per second. If the signal consists of V discrete levels, Nyquist’s theorem states:

max-data-rate = 2H log2 V bits/sec

A noiseless 3 kHz channel with two signal levels (1 bit per sample) cannot transmit a binary signal at a rate exceeding 6000 bits per second (2 * 3000 * log2 2).

Noiseless Channels

We need to send data at a rate of 256 kbps over a noiseless channel with a bandwidth of 20 kHz. How many signal levels (quantization levels) do we need?

max-data-rate = 2H log2 V bits/sec

256000 = 2 * 20000 * log2 V

log2 V = 6.4

Since 6.4 is not an integer, we must either decrease to 6 bits (64 quantization levels) or increase to 7 bits (128 levels), giving bit rates of 240,000 and 280,000 bits per second respectively. The choice depends on the transmission medium.
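The worked example can be reproduced with a short script. This is a minimal sketch of the noiseless-channel formula only; the function names are hypothetical.

```python
import math


def nyquist_max_data_rate_bps(bandwidth_hz: float, levels: int) -> float:
    """Maximum data rate of a noiseless channel: 2 * H * log2(V)."""
    return 2.0 * bandwidth_hz * math.log2(levels)


def bits_per_sample_needed(data_rate_bps: float, bandwidth_hz: float) -> float:
    """Solve the same formula for log2(V), given a target data rate."""
    return data_rate_bps / (2.0 * bandwidth_hz)


if __name__ == "__main__":
    print(nyquist_max_data_rate_bps(3_000, 2))       # binary signal on a 3 kHz channel
    needed = bits_per_sample_needed(256_000, 20_000)
    print(f"log2 V = {needed}")
    # Round down and up to the nearest whole number of bits.
    for bits in (math.floor(needed), math.ceil(needed)):
        rate = nyquist_max_data_rate_bps(20_000, 2 ** bits)
        print(f"{bits} bits ({2 ** bits} levels) -> {rate:.0f} bits/sec")
```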

Noisy Channels

The amount of thermal noise present is measured by the ratio of the signal power S to the noise power N (the signal-to-noise ratio S/N). For a channel of bandwidth H, the maximum capacity C in bits per second is:

C = H log2(1 + S/N)

A power ratio is expressed in decibels as:

dB = 10 * log10(value1/value2)

The capacity of the voice band of a telephone channel can be determined using the Gaussian model. The bandwidth is 3000 Hz and the signal-to-noise ratio is often 30 dB, which corresponds to S/N = 1000. Therefore,

C = 3000 log2(1 + 1000) = 29902 bps approx.
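The telephone-channel figure can be verified numerically. The sketch below converts the decibel value to a power ratio and then applies the capacity formula; the function names are hypothetical.

```python
import math


def db_to_power_ratio(db: float) -> float:
    """Invert dB = 10 * log10(ratio) to recover the plain power ratio."""
    return 10.0 ** (db / 10.0)


def channel_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Capacity of a noisy channel: C = H * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1.0 + db_to_power_ratio(snr_db))


if __name__ == "__main__":
    print(f"S/N for 30 dB: {db_to_power_ratio(30.0):.0f}")
    print(f"Capacity: {channel_capacity_bps(3_000.0, 30.0):.0f} bps")
```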

Audio Formats

Audio formats are described by sample rate and quantization:

Voice quality - 8-bit quantization, 8000 Hz mono (8 KBytes/sec)

Radio quality - 8-bit quantization, 22 kHz mono (22 KBytes/sec) or stereo (44 KBytes/sec)

CD quality - 16-bit linear quantization, 44100 Hz stereo (176 KBytes/sec)
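The data rates quoted for each format follow directly from sample rate, sample size and channel count. A minimal sketch, with a hypothetical function name and assuming uncompressed PCM with a whole number of bytes per sample:

```python
def pcm_data_rate_bytes_per_s(sampling_rate_hz: int, bits_per_sample: int,
                              channels: int) -> int:
    """Uncompressed PCM data rate: samples/sec * bytes/sample * channels."""
    if bits_per_sample % 8 != 0:
        raise ValueError("this sketch assumes whole bytes per sample")
    return sampling_rate_hz * (bits_per_sample // 8) * channels


if __name__ == "__main__":
    formats = {
        "voice (8 kHz, 8-bit, mono)": (8_000, 8, 1),
        "radio (22 kHz, 8-bit, stereo)": (22_000, 8, 2),
        "CD (44.1 kHz, 16-bit, stereo)": (44_100, 16, 2),
    }
    for name, params in formats.items():
        print(f"{name}: {pcm_data_rate_bytes_per_s(*params)} bytes/sec")
```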

Audio Formats

mu-law encoding - defined in CCITT G.711, the standard for voice data used by telephone companies in the USA, Canada and Japan.

A-law encoding - used for telephony elsewhere.

A-law and mu-law audio is sampled at 8000 samples/second with a precision of 12 bits, then compressed to 8-bit samples.
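The idea behind this compression is a logarithmic curve that gives quiet samples more of the output range than loud ones. The sketch below uses the continuous mu-law formula with mu = 255 on samples in the range -1.0 to 1.0. It is an illustration of the companding curve only, not the segmented 8-bit codec that G.711 actually specifies, and the function names are hypothetical.

```python
import math

MU = 255.0  # value of mu used for mu-law telephony


def mu_law_compress(x: float) -> float:
    """Compress a sample in [-1, 1]: sign(x) * ln(1 + mu*|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress: sign(y) * ((1 + mu)**|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


if __name__ == "__main__":
    for x in (0.001, 0.01, 0.1, 0.5, 1.0):
        y = mu_law_compress(x)
        print(f"x = {x:<5} -> compressed {y:.4f} -> expanded {mu_law_expand(y):.4f}")
```

After compression, the values would be quantized uniformly to 8 bits; because small amplitudes have been stretched, quiet speech keeps more resolution than it would with plain 8-bit linear quantization.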

Audio Formats

mu-law and A-law: 8-bit precision.

PCM can be stored at various precisions; 16-bit PCM is common.

Multiple channels of audio may be interleaved at sample boundaries.

Common file formats: au (Sun/NeXT), wav (Microsoft RIFF/waveform format), aiff (Apple), RealAudio, mp3.
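Interleaving at sample boundaries means that the samples for one instant are stored next to each other, one per channel, before the samples for the next instant. A minimal sketch for two channels, with hypothetical function names and plain integers standing in for sample values:

```python
def interleave(left: list[int], right: list[int]) -> list[int]:
    """Merge two channels into one stream: L0, R0, L1, R1, ..."""
    if len(left) != len(right):
        raise ValueError("both channels must have the same number of samples")
    stream: list[int] = []
    for l_sample, r_sample in zip(left, right):
        stream.extend((l_sample, r_sample))
    return stream


def deinterleave(stream: list[int], channels: int = 2) -> list[list[int]]:
    """Split an interleaved stream back into one list per channel."""
    return [stream[c::channels] for c in range(channels)]


if __name__ == "__main__":
    stream = interleave([1, 2, 3], [10, 20, 30])
    print(stream)
    print(deinterleave(stream))
```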

Audio Quality

Popular Sampling Rates

Coding Methods

3D Sound Projection

The shortest path between the sound source and the listener is called the direct sound path.

All other sound paths are reflected, which means they are delayed in time before they reach the listener's ear.
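The size of that delay depends on how much longer the reflected path is than the direct one. A minimal sketch, assuming a speed of sound of 343 m/s (dry air at roughly 20 °C) and a hypothetical function name:

```python
# Assumed value: speed of sound in dry air at roughly 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def reflection_delay_s(direct_path_m: float, reflected_path_m: float,
                       speed: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Extra travel time of a reflected path compared with the direct path."""
    if reflected_path_m < direct_path_m:
        raise ValueError("a reflected path cannot be shorter than the direct path")
    return (reflected_path_m - direct_path_m) / speed


if __name__ == "__main__":
    # Listener 3 m from the source; a wall reflection travels 8 m in total.
    delay = reflection_delay_s(3.0, 8.0)
    print(f"The reflection arrives {delay * 1000:.1f} ms after the direct sound")
```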

Music and MIDI

The MIDI standard defines how to code all the elements of a musical score, such as the sequence of notes, timing conditions and the instrument to play each note.

MIDI is a standard that manufacturers of musical instruments use so that instruments can communicate musical information via computers.

Music and MIDI

MIDI does not transmit an audio signal or media. It simply transmits digital data "event messages" such as the pitch and intensity of musical notes to play, control signals for parameters such as volume, vibrato and panning, and cues and clock signals to set the tempo.

Because the music is stored as event data rather than as recorded waveforms, MIDI files are small.
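To show how compact an event message is, the sketch below builds the raw bytes of MIDI Note On and Note Off messages: a status byte carrying the message type and channel, followed by two data bytes. The function names are hypothetical, and channels are numbered 0 to 15 here, which is how they appear in the status byte.

```python
def _check(channel: int, note: int, velocity: int) -> None:
    """Validate the ranges allowed by the MIDI message layout."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not 0 <= note <= 127 or not 0 <= velocity <= 127:
        raise ValueError("note and velocity must be in 0..127")


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Note On: status 0x90 plus channel, then note number and velocity."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Note Off: status 0x80 plus channel, then note number and velocity."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


if __name__ == "__main__":
    # Middle C (note number 60) on the first channel, moderately loud.
    print(note_on(0, 60, 100).hex(" "))
    print(note_off(0, 60).hex(" "))
```

Starting and stopping a note takes six bytes in total, however long the note sounds, whereas 16-bit stereo PCM at 44100 Hz needs 176,400 bytes for every second of audio.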

MIDI Interface

Hardware - specifies a MIDI port (which plugs into the computer's serial port) and a MIDI cable.

Data format - includes the instrument specification, the notion of the beginning and end of a note, frequency and sound volume. Data are grouped into MIDI messages that each specify a musical event.

An instrument that satisfies both is a MIDI device (e.g. synthesizer).

MIDI File Formats

Standard MIDI File (SMF) Format:

MIDI files are typically created using computer-based sequencing software (or sometimes a hardware-based MIDI instrument or workstation) that organizes MIDI messages into one or more parallel "tracks" for independent recording and editing. In most sequencers, each track is assigned to a specific MIDI channel and/or a specific General MIDI instrument patch.

MIDI File Formats

MIDI Karaoke File (.KAR) Format:

MIDI Karaoke files (which use the ".kar" file extension) are an "unofficial" extension of MIDI files, used to add synchronized lyrics to standard MIDI files. SMF players play the music as they would a .mid file but do not display the lyrics unless they have specific support for .kar messages. Players with such support often display the lyrics synchronized with the music in "follow-the-bouncing-ball" fashion, essentially turning any PC into a karaoke machine.

MIDI software

MIDI software includes music recording and performance applications, musical notation and printing applications, music education applications, etc.

The MIDI standard specifies 16 channels and identifies 128 instruments.

MOD File Format

MOD is a computer file format used primarily to represent music, and was the first module file format. MOD files use the “.MOD” file extension, except on the Amiga where the original trackers instead use a “mod.” prefix scheme, e.g. “mod.echoing”. A MOD file contains a set of instruments in the form of samples, a number of patterns indicating how and when the samples are to be played, and a list of what patterns to play in what order.

Human Speech

The human ear is most sensitive in the range 600 Hz to 6000 Hz.

Real-time signal generation allows text to be transformed into speech without lengthy processing.

Generated speech must be understandable and must sound natural.

Speech transmission - coding, recognition and synthesis methods aim to achieve a minimal data rate for a given quality.

End of lesson