
The Audio Signal

Digital Audio Processing (Elaborazione dell'audio digitale)
Cinema, Computer and Telecommunications Engineering

Antonio Servetti

Internet Media Group http://media.polito.it

Dip. di Automatica ed Informatica, Politecnico di Torino







Outline

Reference: Lombardo, "Audio e Multimedia", Ch. 1 and 2

Where does an audio signal come from?
Waveform basics (sinusoids)

Audio objective attributes
Amplitude/intensity, frequency, duration
Perceived ranges

Audio subjective features
From sinusoids to real sounds
Loudness, pitch, timbre

Production: Amplitude, Frequency, Waveform
Perception: Loudness, Pitch, Timbre



Audio signal - i.e. sound

Signal
"Something that carries information with its variations in time/space"; it can be manipulated, stored, transmitted

Sound
It is a mechanical wave caused by a vibrating object that propagates in every direction in a medium (such as air or water) through compression and rarefaction, and that can be detected by the human ear

An audio signal is the representation of that sound



Audio signal waveform

Representation of the pattern of changing air pressure as it evolves with time

Characterized by amplitude, frequency (and phase)



Amplitude

Represents the intensity/energy of the sound at a given point in time or space

Measured as sound pressure: the difference between the average local pressure and the pressure of the sound wave



Sound pressure level

Represents the sound energy level

Measured using the root mean square (RMS) amplitude over a time period (because the amplitude has zero mean)

On a logarithmic scale (decibel, dB)
The human ear can detect sounds with a wide range of amplitudes (from $p_0 = 2 \cdot 10^{-5}$ N/m² to about 30 N/m²)

w.r.t. a reference level
Threshold of hearing at 1 kHz ($p_0 = 2 \cdot 10^{-5}$ N/m²)

Intensity is proportional to the square of the (RMS) pressure, so:

$L_p = 10 \log_{10}(p_{rms}^2 / p_0^2) = 20 \log_{10}(p_{rms} / p_0)$ dB SPL
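As a rough illustration of the RMS and decibel steps above, the following sketch computes the level of a pure tone. It assumes the standard reference pressure of 20 µPa ($2 \cdot 10^{-5}$ N/m²); the names `rms`, `spl_db` and `P0` are made up for this example.

```python
import math

P0 = 2e-5  # assumed reference pressure in N/m^2 (20 micropascal)

def rms(samples):
    """Root mean square of a list of pressure samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def spl_db(p_rms, p_ref=P0):
    """Sound pressure level in dB SPL: 20 * log10(p_rms / p_ref)."""
    return 20.0 * math.log10(p_rms / p_ref)

# One full cycle of a sine wave with a peak pressure of 1 N/m^2.
n = 1000
tone = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
p_rms = rms(tone)

print(round(p_rms, 4))            # 0.7071 (peak divided by sqrt(2))
print(round(spl_db(p_rms), 1))    # 91.0
print(round(spl_db(1.0), 1))      # 94.0 (an RMS pressure of 1 N/m^2)
```

The mean of the samples is zero, which is why the RMS value rather than the plain average is used as the amplitude measure.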



SPL reference table

Rubbing your hands in front of your nose is around 65 dB SPL (calibration trick)

Useful sound levels: between 50 and 100 dB SPL
50: average home
60: conversational speech
100: disco music



Frequency (Hz)

Frequency: number of cycles per unit of time
Related to the pitch ("altezza") of a sound (low or high, "grave, acuto")

Perceived frequency range:
20 Hz to 20,000 Hz, but the upper limit decreases with age (e.g. to 16,000 Hz)
Below 20 Hz we perceive vibration with the body

Tuning fork example
Audacity: tuning_fork_a4.wav



Fundamental frequency

It is the lowest frequency in a sound

Musical instruments (e.g. piano)
C4 (middle C, DO4) = 261.6 Hz
A4 (LA4) = 440 Hz, A5 (LA5) = 880 Hz (one octave higher)
Lowest note = 27.5 Hz (A0), highest note = 4186 Hz (C8)

Speech
Child speech ranges from 250 to 400 Hz; adult females tend to speak at around 200 Hz on average and adult males at around 125 Hz.

Singers
Soprano: C4 to C6 (1046.50 Hz), Tenor: C3 to C5
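The note frequencies quoted on this slide can be reproduced with a short sketch. It assumes 12-tone equal temperament tuned to A4 = 440 Hz, which the slide does not state explicitly; `note_frequency` and `NOTE_OFFSETS` are illustrative names.

```python
A4_HZ = 440.0  # tuning reference (A4 / LA4)

# Semitone distance of each natural note from A within the same octave.
NOTE_OFFSETS = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}

def note_frequency(name, octave):
    """Frequency in Hz of a natural note, assuming equal temperament."""
    semitones = NOTE_OFFSETS[name] + 12 * (octave - 4)
    return A4_HZ * 2.0 ** (semitones / 12.0)

for name, octave in [("A", 0), ("C", 4), ("A", 4), ("A", 5), ("C", 6), ("C", 8)]:
    print(f"{name}{octave}: {note_frequency(name, octave):.2f} Hz")
```

Each octave doubles the frequency, so A5 is exactly twice A4, and the piano extremes A0 and C8 come out at about 27.5 Hz and 4186 Hz.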



Speech, voice, music, audio, …

With respect to sound production we identify
General audio: all perceived sound
Speech, voice, music: each represents a subregion of the frequency range / dynamic range

Audio: frequency range 20-20,000 Hz, intensity range ~100 dB
Telephone speech: frequency range 300-3400 Hz, intensity range ~80 dB

[Figure: audio, telephone speech and voice regions in the frequency/intensity plane]



Some theory …




Base reference: sinusoid

Most basic signal: $s(t) = A\cos(\omega_0 t + \phi)$

Angle as a function of time, given
$A$: amplitude, $\omega_0$: radian frequency ($\omega_0 = 2\pi f_0$), $\phi$: phase

A above middle C (LA, 440 Hz)
$s(t) = 10\cos(2\pi \cdot 440\, t + \phi)$, with a negative phase $\phi$

Period: the shortest time for the signal to repeat itself
$T_0 = 1/f_0 = 1/440$ s ≈ 2.27 ms
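A minimal sketch of how such a sinusoid could be sampled and its period checked numerically. The phase value `PHI` and the sample rate `FS` are illustrative assumptions, not values taken from the slide.

```python
import math

def sinusoid(t, amplitude, freq_hz, phase_rad):
    """s(t) = A * cos(2*pi*f0*t + phi)."""
    return amplitude * math.cos(2.0 * math.pi * freq_hz * t + phase_rad)

A = 10.0
F0 = 440.0
PHI = -0.4 * math.pi   # hypothetical negative phase, for illustration only
FS = 44100             # assumed sample rate in Hz

T0 = 1.0 / F0
print(f"T0 = {T0 * 1e3:.2f} ms")   # T0 = 2.27 ms

# 10 ms of samples (441 samples at 44.1 kHz).
samples = [sinusoid(n / FS, A, F0, PHI) for n in range(FS // 100)]

# The signal repeats after one period: s(t) == s(t + T0).
t = 0.00123
assert abs(sinusoid(t, A, F0, PHI) - sinusoid(t + T0, A, F0, PHI)) < 1e-9
```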



Phase shift and time shift

Phase (together with frequency) determines the time locations of the maxima and minima of a cosine wave: $\phi = 0$ puts a positive peak at $t = 0$

Time shifting
$x_1(t) = s(t - t_1)$
$t_1$ positive -> signal s(t) has been delayed
$t_1$ negative -> signal s(t) has been advanced
(look at the positive peak closest to t = 0)

Phase shifting to time shifting
$\cos(\omega_0 t + \phi) = \cos(\omega_0 (t - t_1))$ where $t_1 = -\phi / \omega_0$
Phase shift is negative when time shift is positive

Reference: McClellan, "Signal Processing First", Ch. 2 Sinusoids
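The relation between phase shift and time shift, $t_1 = -\phi / \omega_0$, can be checked numerically. This is only a sketch: the phase of a quarter cycle and the test instants are arbitrary choices for illustration.

```python
import math

def time_shift_from_phase(phase_rad, freq_hz):
    """t1 = -phi / omega0, with omega0 = 2*pi*f0."""
    return -phase_rad / (2.0 * math.pi * freq_hz)

f0 = 440.0
w0 = 2.0 * math.pi * f0
phi = -math.pi / 2.0    # hypothetical phase: a quarter-cycle lag

t1 = time_shift_from_phase(phi, f0)
print(f"t1 = {t1 * 1e3:.3f} ms")   # t1 = 0.568 ms (positive: the signal is delayed)

# cos(w0*t + phi) and cos(w0*(t - t1)) agree at every instant.
for t in (0.0, 0.0005, 0.0013):
    assert abs(math.cos(w0 * t + phi) - math.cos(w0 * (t - t1))) < 1e-9
```

A negative phase gives a positive `t1`, matching the statement that the phase shift is negative when the time shift is positive.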



Phase and delay

For a single sound source the phase value is not relevant (it is just a delay)
But with multiple sound sources the relative phase is important (i.e. constructive or destructive effects, stereo image)

From phase to delay (and vice versa) as a function of the signal frequency
$\Delta t = \phi / (2\pi f)$
(e.g. at 440 Hz, $\phi = \pi$ => $\Delta t$ = 1.136 ms)
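To make the constructive and destructive cases concrete, the sketch below converts a phase into a delay and sums two equal-amplitude sinusoids with a given relative phase. The function names and the unit amplitudes are assumptions made for the example.

```python
import math

def phase_to_delay(phase_rad, freq_hz):
    """Delay in seconds equivalent to a phase at a given frequency."""
    return phase_rad / (2.0 * math.pi * freq_hz)

def peak_of_sum(freq_hz, rel_phase_rad, n=2000):
    """Peak of cos(wt) + cos(wt + phase), scanned over one period."""
    period = 1.0 / freq_hz
    w = 2.0 * math.pi * freq_hz
    return max(
        abs(math.cos(w * t) + math.cos(w * t + rel_phase_rad))
        for t in (k * period / n for k in range(n))
    )

print(f"{phase_to_delay(math.pi, 440.0) * 1e3:.3f} ms")   # 1.136 ms
print(round(peak_of_sum(440.0, 0.0), 3))                   # 2.0 (constructive)
print(round(peak_of_sum(440.0, math.pi), 3))               # 0.0 (destructive)
```

With a relative phase of π, i.e. a delay of half a period, the two sources cancel; in phase they add up to twice the amplitude.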



From theory to real sounds




Real sounds do not last forever

Real sounds are "transient"
They last for a finite time span: they come to life and then die out

Audacity: tuning_fork_a4.wav
Audacity: trumpet_G4.wav



Transients: ADSR

Time envelope
Evolution of the sound amplitude with time (positive peaks)

ADSR
Attack: initial run-up of the level from nil to peak
Decay: subsequent run-down to the designated sustain level
Sustain: level during the main part of the sound's duration
Release: decay of the level from the sustain level to zero
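The four ADSR stages can be sketched as a piecewise-linear envelope. Real instruments have curved segments, so the straight lines here are a simplifying assumption, and the parameter names and example values are illustrative.

```python
def adsr(t, attack, decay, sustain_level, sustain_time, release):
    """Piecewise-linear ADSR envelope value (0..1) at time t in seconds."""
    if t < 0:
        return 0.0
    if t < attack:                       # Attack: rise from nil to peak
        return t / attack
    t -= attack
    if t < decay:                        # Decay: fall from peak to sustain level
        return 1.0 - (1.0 - sustain_level) * t / decay
    t -= decay
    if t < sustain_time:                 # Sustain: hold the level
        return sustain_level
    t -= sustain_time
    if t < release:                      # Release: fall from sustain level to zero
        return sustain_level * (1.0 - t / release)
    return 0.0

# Hypothetical settings: 10 ms attack, 100 ms decay, sustain at 0.6 for 0.5 s, 200 ms release.
params = dict(attack=0.01, decay=0.1, sustain_level=0.6, sustain_time=0.5, release=0.2)
for t in (0.0, 0.005, 0.01, 0.06, 0.3, 0.71, 1.0):
    print(f"t = {t:5.3f} s  level = {adsr(t, **params):.2f}")
```

Multiplying a steady tone by this envelope sample by sample gives it a finite life; shortening `attack` makes the result sound more percussive.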



Transients: ADSR

Musical instruments have different ADSR envelopes
A rapid attack tends to be heard as a percussive sound
A slow attack is more fitting for wind instruments

Note:
• Even experienced musicians may have difficulty identifying the source of a sound when its envelope is manipulated



Real sounds are not periodic

Quasi-periodic: they repeat (almost) identically after some (almost) constant time
Aperiodic: no clear periodicity can be identified (noise-like)

[Figure: waveforms, amplitude vs. time (s). La110Chitarra.wav: quasi-periodic, with period T0 marked. castanets.wav: aperiodic]



Quasi-periodic signals

Complex sounds with multiple frequency components
There is no single frequency
Fundamental frequency (F0): set by the signal period (lowest frequency)
Harmonics (Fn): integer multiples of F0 (the other peaks within the waveform cycle)

[Figure: waveform of La110Chitarra.wav, amplitude vs. time (s), with period T0 marked]
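One common way to read the period T0 off a quasi-periodic waveform is to look for the lag at which the signal best matches a shifted copy of itself (autocorrelation). The sketch below applies this to a synthetic tone, not to the guitar recording; the 110 Hz fundamental, the harmonic amplitudes and the 8800 Hz sample rate are assumptions chosen so that one period is exactly 80 samples.

```python
import math

def estimate_f0(x, fs, fmin, fmax, window):
    """Return fs / lag for the lag with the largest autocorrelation.

    Lags are searched between fs/fmax and fs/fmin samples.
    """
    best_lag, best_score = None, float("-inf")
    for lag in range(int(fs / fmax), int(fs / fmin) + 1):
        score = sum(x[n] * x[n + lag] for n in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag

FS = 8800          # assumed sample rate (Hz)
F0 = 110.0         # assumed fundamental (Hz)
HARMONICS = ((1, 1.0), (2, 0.5), (3, 0.25))   # (harmonic number, amplitude)

x = [
    sum(a * math.sin(2.0 * math.pi * k * F0 * n / FS) for k, a in HARMONICS)
    for n in range(1200)
]

print(estimate_f0(x, FS, fmin=60.0, fmax=400.0, window=800))   # 110.0
```

The harmonics at 220 Hz and 330 Hz add peaks within each cycle, but the waveform only repeats as a whole every 1/110 s, so the estimate lands on the fundamental.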



Complex sounds

Complex sounds can be approximated by sine waves with different amplitude, frequency and phase

Fundamental frequency: 155 Hz
2nd harmonic with amplitude 1/7 and phase +75°
3rd harmonic with amplitude 1/3.5 and phase +250°
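The example on this slide can be sketched as a sum of three sine waves. The slide gives the harmonic amplitudes relative to the fundamental; taking the fundamental with amplitude 1 and phase 0, and using sines rather than cosines, are assumptions of this sketch.

```python
import math

F0 = 155.0  # fundamental frequency in Hz

# (harmonic number, relative amplitude, phase in degrees)
PARTIALS = [
    (1, 1.0, 0.0),        # fundamental: amplitude and phase assumed
    (2, 1.0 / 7.0, 75.0),
    (3, 1.0 / 3.5, 250.0),
]

def complex_tone(t):
    """Sum of the harmonics at time t (seconds)."""
    return sum(
        amp * math.sin(2.0 * math.pi * h * F0 * t + math.radians(phase))
        for h, amp, phase in PARTIALS
    )

# The sum is still periodic with the period of the fundamental.
T0 = 1.0 / F0
for t in (0.0, 0.001, 0.0042):
    assert abs(complex_tone(t) - complex_tone(t + T0)) < 1e-9

print(f"period = {T0 * 1e3:.2f} ms")   # period = 6.45 ms
```

Changing the amplitudes and phases alters the shape of the waveform within each cycle, while the period stays that of the 155 Hz fundamental.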



Complex sounds (example)

Three instruments (flute, oboe, violin) playing the same note have different frequency content



Timbre (audio demo)

Demonstration 29. Effect of Tone Envelope on Timbre (2:16)

You will hear a recording of a Bach chorale played on a piano.
Now the same chorale played backwards.
Now the tape of the last recording is played backwards so that the chorale is heard forwards.

The purpose of this demonstration (originally presented by J. Fassett) is to show that the temporal envelope of a tone, i.e. the time course of the tone's amplitude, has a significant influence on the perceived timbre of the tone. By removing the attack segment of an instrument's sound, or by substituting the attack segment of another musical instrument, the perceived timbre of the tone may change so drastically that the instrument is no longer recognizable.

In this demonstration, a four-part chorale by J.S. Bach ("Als der gütige Gott") is played on a piano and recorded on tape. Next the chorale is played backward on the piano from end to beginning, and recorded again. Finally the tape recording of the backward chorale is played in reverse, yielding the original (forward) chorale, except that each note is reversed in time. The instrument does not sound like a piano any more, but rather resembles a kind of reed organ. The power spectrum of each note, measured over the note's duration, is not changed by temporal reversal of the tone.

D29C.EffectOfToneEnvelopeOnTimbre.ChoralePlayedBackwardReversed
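The last claim of the demonstration, that reversing a note in time leaves its power spectrum unchanged, can be checked on a toy signal. The sketch uses a naive DFT and a synthetic decaying tone as a stand-in for a piano note; both are assumptions for illustration.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of the DFT bins 0..N/2 (naive O(N^2) transform)."""
    n = len(x)
    return [
        abs(sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)))
        for m in range(n // 2 + 1)
    ]

# A short "note": sharp attack followed by an exponential decay.
N = 128
note = [math.exp(-k / 20.0) * math.sin(2.0 * math.pi * 8 * k / N) for k in range(N)]
reversed_note = note[::-1]   # slow swell followed by an abrupt stop

spec_forward = magnitude_spectrum(note)
spec_reversed = magnitude_spectrum(reversed_note)

max_diff = max(abs(a - b) for a, b in zip(spec_forward, spec_reversed))
print(max_diff < 1e-9)   # True: same magnitude spectrum, different envelope
```

Only the phases of the spectrum differ between the two versions, yet the envelope, and with it the perceived timbre, changes completely.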



From math to perception

Audio technology is heavily related to audio perception because the human hearing system and brain are involved

Psychoacoustics: how are physical measures related to audio perception?



From physics to biology




The human ear

Get slides (and videos) from Audio Coding



Loudness

Psychological correlate of sound amplitude
That attribute of auditory sensation in terms of which sounds can be ordered on a scale extending from quiet to loud

Loudness level (phon)
Correlated to the logarithm of intensity, but not uniform in frequency
Affected by frequency bandwidth and duration

Perceived loudness (sone)
Phons scale with level in dB, not with loudness (+10 dB roughly doubles the loudness in sones)

Reference: http://www.sengpielaudio.com/calculatorSonephon.htm
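The difference between the two scales can be sketched with the usual conversion rule: 1 sone corresponds to 40 phon, and every additional 10 phon doubles the loudness in sones. The rule is generally taken to hold above about 40 phon; the function names are illustrative.

```python
import math

def phon_to_sone(phon):
    """Loudness in sones for a loudness level in phons (valid above ~40 phon)."""
    return 2.0 ** ((phon - 40.0) / 10.0)

def sone_to_phon(sone):
    """Loudness level in phons for a loudness in sones (valid above ~1 sone)."""
    return 40.0 + 10.0 * math.log2(sone)

for phon in (40, 50, 60, 70, 80):
    print(f"{phon} phon -> {phon_to_sone(phon):.0f} sone")
```

Going from 40 to 80 phon adds 40 dB in level but makes the sound 16 times as loud on the sone scale, which is why phons track level rather than perceived loudness.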






Frequency (Hz)

Frequency: number of cycles per unit of time
Perceived frequency range: 20 Hz to 20,000 Hz, but the upper limit decreases with age (e.g. to 16,000 Hz)

Perception:
The pitch perceived by the ear is proportional to the logarithm of frequency rather than to frequency itself
Example: 'sine_sweep.mp3'

Reference: Lombardo, "Audio e Multimedia", Ch. 1 Acustica



Pitch

Pitch is a perceptual property that allows the ordering of sounds on a frequency-related scale
Human perception of pitch is approximately logarithmic with respect to fundamental frequency

Pitch is an auditory sensation
For pure tones it maps to frequency
For complex tones it is ambiguous

Labeling (scientific pitch notation)
Note + octave (e.g. C0 16 Hz, C4 261 Hz, A4 440 Hz)
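The logarithmic relation between frequency and pitch is what makes note labels possible: equal frequency ratios map to equal steps. A minimal sketch of the labeling, assuming equal temperament with A4 = 440 Hz and the MIDI numbering convention (A4 = 69), neither of which is stated on the slide:

```python
import math

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Scientific pitch notation label of the note closest to freq_hz."""
    # Distance from A4 in semitones is 12 * log2(f / 440).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{NAMES[midi % 12]}{midi // 12 - 1}"

for f in (16.0, 261.0, 440.0, 880.0):
    print(f"{f:6.1f} Hz -> {nearest_note(f)}")
```

Doubling the frequency always moves the label up by exactly one octave, whatever the starting note.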



Pitch of harmonic sounds

Harmonic sounds can be approximated by sine waves with different amplitude, frequency and phase

Fundamental frequency: 155 Hz
2nd harmonic with amplitude 1/7 and phase +75°
3rd harmonic with amplitude 1/3.5 and phase +250°



Complex sounds (audio demo)

Cancelled Harmonics
A complex tone is presented, followed by several cancellations and restorations of a particular harmonic. This is done for harmonics 1 through 10.

This demonstration illustrates Fourier analysis of a complex tone consisting of 20 harmonics of a 200-Hz fundamental.

When we listen analytically, we hear the different components separately; when we listen holistically, we focus on the whole sound and pay little or no attention to the components.

When the relative amplitudes of all 20 harmonics remain steady (even if the total intensity changes), we tend to hear them holistically. However, when one of the harmonics is turned off and on, it stands out clearly.

01_cancelled_harmonics



Virtual pitch

When there is no discernible fundamental, the ear will often create one
1st: individually, the partials sound like high-pitched sinusoids
2nd: together, they create the percept of a single sound at a lower frequency

Reference: Sethares, "Tuning, Timbre, Spectrum, Scale", Ch. 2 The Science of Sound
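A very crude way to predict the pitch the ear "creates" is to look for the largest frequency of which all partials are integer multiples. This is a simplification of real pitch perception, and the partial frequencies below are hypothetical.

```python
from functools import reduce
from math import gcd

def virtual_pitch_hz(partials_hz):
    """Greatest common divisor of integer partial frequencies (a crude model)."""
    return reduce(gcd, partials_hz)

# Partials at 600, 800 and 1000 Hz: harmonics 3, 4 and 5 of a missing 200 Hz fundamental.
print(virtual_pitch_hz([600, 800, 1000]))   # 200
```

Heard one at a time these partials are three high tones; together their sum repeats every 1/200 s, which matches the single lower pitch described on the slide.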



Sound identification

Reference: Watkinson, "The Art of Digital Audio", Ch. 2

Location and size
Time domain response: works quickly and is older in evolutionary terms (< 1 ms)

Pitch and timbre
Frequency domain response: works more slowly, evolved later, presumably after speech evolved (> 10-30 ms)



Listening examples

Courtesy of www.audiocheck.net (http://www.audiocheck.net/)

Calibration
http://www.audiocheck.net/testtones_hearingtestaudiogram.php
Audiograms require a properly calibrated audio system. As we have no idea how loud your sound level has been turned to as you listen to our sound files, running an online audiogram test requires a trick (as imprecise as it is).

First, we need you to adjust your computer's level to match a known reference. Here is the trick: rub your hands together, in front of your nose, quickly and firmly, and try producing the same sound as our calibration file. You are now generating a reference sound that is approximately 65 dB SPL.

High frequency range test (8-22 kHz)
http://www.audiocheck.net/audiotests_frequencycheckhigh.php
A -9 dBFS sweeping sine tone, from 22 kHz (supposedly inaudible) down to 8 kHz (if you can't hear this one, consider checking your hearing). On top of the test tone, a voiceover tells you which frequency is currently playing.

Play back the file until you start hearing the underlying high-pitched tone as it descends. The voiceover tells you the frequency you have reached. This frequency more or less represents the upper limit of your audio system, or your hearing.




Listening examples (cont.)

Courtesy of www.audiocheck.net

Dynamic range
http://www.audiocheck.net/audiotests_dynamiccheck.php
Dynamic range represents the ratio between the loudest signal you can hear and the quietest. Dynamic range is expressed in terms of decibels (dB). Being a ratio, the decibel has no units; everything is relative. Since it is relative, it must be relative to some reference point that has to be defined. Our reference point here is the loudest level you can comfortably bear for one second. This test helps you benchmark the dynamic range of your sound system.

Interestingly, much emphasis is put on 24-bit audio recordings nowadays, with a dynamic range exceeding 140 dB. Our example is only 16-bit, with a maximum dynamic range of 96 dB, yet that should be plenty. Judge for yourself.
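The 96 dB and 140 dB figures follow from the number of quantization levels. A small sketch, using the common approximation that an N-bit linear format spans 20·log10(2^N) dB, and also converting a dBFS level such as the -9 dBFS test tone into a linear amplitude; the function names are illustrative.

```python
import math

def dynamic_range_db(bits):
    """Approximate dynamic range of N-bit linear PCM: 20 * log10(2**N)."""
    return 20.0 * math.log10(2 ** bits)

def dbfs_to_amplitude(dbfs):
    """Linear amplitude relative to full scale (1.0) for a level in dBFS."""
    return 10.0 ** (dbfs / 20.0)

print(f"16-bit: {dynamic_range_db(16):.1f} dB")    # 16-bit: 96.3 dB
print(f"24-bit: {dynamic_range_db(24):.1f} dB")    # 24-bit: 144.5 dB
print(f"-9 dBFS: {dbfs_to_amplitude(-9.0):.3f}")   # -9 dBFS: 0.355
```

Each extra bit doubles the number of levels and so adds about 6 dB.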





Sound source localization



Sound source localization

Goal: building a sound map of the objects around us
From an evolutionary point of view, this is the first use of hearing

Positioning along three main directions
Front-back: frontal plane
Left-right: median plane (equidistant from the ears)
Above-below: horizontal plane (the plane the ears lie on)

Fig. 3.20



Sound source position

Expressed by a vector characterized by 2 angles
Azimuth (0° front, 180° back)
• Angle between the projection of the vector onto the horizontal plane and the vector along the front-back direction
Elevation (-90° below, 90° above)
• Angle between the vector and the horizontal plane

And by a scalar
Distance
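The (azimuth, elevation, distance) description can be turned into Cartesian coordinates with basic trigonometry. The axis orientation below (x towards the front, y to the side, z upwards) is an assumption of this sketch; the slide does not fix one.

```python
import math

def to_cartesian(azimuth_deg, elevation_deg, distance):
    """Convert azimuth/elevation/distance to (x, y, z).

    Assumed axes: x points to the front, y to the side, z upwards.
    Azimuth is measured in the horizontal plane from the front direction,
    elevation from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.cos(az)   # front-back component
    y = distance * math.cos(el) * math.sin(az)   # left-right component
    z = distance * math.sin(el)                  # above-below component
    return x, y, z

for az, el in [(0, 0), (180, 0), (90, 0), (0, 90)]:
    x, y, z = to_cartesian(az, el, 2.0)
    print(f"azimuth {az:3d}, elevation {el:2d} -> ({x:5.2f}, {y:5.2f}, {z:5.2f})")
```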



Directional hearing

Two mechanisms have been identified; both describe the difference between the sounds at the two ears
ITD: Interaural time difference (time or phase)
IID: Interaural intensity difference (intensity or amplitude)

Fig. 3.23



Interaural Time Difference

It arises whenever a source is not exactly on the median plane
The path travelled by the sound to reach the "opposite" ear is longer, so the sound arrives there later

A precision of one degree (left/right) can be achieved; the ITD grows to about 0.6 ms for a source fully to one side
Works up to about 1000 Hz, where the wavelength becomes comparable to the distance between the ears
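The order of magnitude of the ITD can be sketched with a simple path-difference model: the extra distance to the far ear is taken as d·sin(azimuth), ignoring the curvature of the head. The ear spacing of 0.20 m and the speed of sound of 343 m/s are assumed values, and the azimuth here is measured from the median plane.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air, assumed
EAR_DISTANCE = 0.20      # m, assumed spacing between the ears

def itd_seconds(azimuth_deg, ear_distance=EAR_DISTANCE, c=SPEED_OF_SOUND):
    """Interaural time difference for a distant source (straight-path model)."""
    return ear_distance * math.sin(math.radians(azimuth_deg)) / c

for az in (0, 1, 30, 90):
    print(f"azimuth {az:2d} deg -> ITD {itd_seconds(az) * 1e6:6.1f} us")

print(f"wavelength at 1000 Hz: {SPEED_OF_SOUND / 1000.0:.3f} m")
```

Under these assumptions a source fully to one side gives just under 0.6 ms, while a one-degree offset from the front corresponds to a difference of only about 10 µs. The wavelength at 1000 Hz is about 0.34 m, of the same order as the ear spacing, which is where the time cue stops being reliable.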



Interaural Intensity Difference

Defined as a difference in amplitude or in spectrum, because not all the frequencies of the sound reach one of the two ears
They are filtered by the head
High frequencies (> 1500 Hz) are reflected
Low frequencies are diffracted and bend around the listener



Head Related Transfer Function

A transfer function related to the head
Describes all the changes, in phase and amplitude, that the waveform undergoes on its way to our ears
It is measured with dedicated microphones placed in the ears of dummy heads
HRTFs are hard to generalize
• One person's HRTF fits another person's perception poorly

The outer ear (pinna) also "filters" the signal
Its folds make it possible to perceive the elevation of a sound source
The pinna also helps to tell whether a sound comes from the front or from behind



Precedence effect

With two (or more) sound sources in different positions, a single direction is perceived, which corresponds:
Below the sound intensity curve, roughly to the first source that reaches the ears (Haas effect)
Above the curve, to the louder sound

After a delay of 30 ms an echo starts to be perceived



Loudspeaker placement

The loudspeakers and the listener are placed at the vertices of an equilateral triangle
Otherwise the positioning of the sound sources becomes less stable
• If the loudspeakers are too far apart, as in a cinema, it is easy to perceive a hole in the central region between them



Phantom sound images

They are created at an intermediate position between the two loudspeakers by means of intensity differences, when the time difference is very small (0.05 < dt < 1.5 ms)
Instead of perceiving two distinct sound sources, the source appears positioned towards the louder loudspeaker (or at the centre if the intensities are equal)

Caveat
If the listener is not at the correct distance between the two loudspeakers, the perceived phantom source is not the intended one
The usable frequency range is limited (< 700 Hz)
• Above that, interference due to the ear spacing and filtering by the head come into play



Pan Potting

Mathematical formulation
$\alpha$: perceived angle between the phantom source and the median plane
$\beta$: angle subtended by the two loudspeakers at the listener's position (w.r.t. the median plane)
$\theta$: angle that displaces the real sound source

$\sin\alpha = \frac{L - R}{L + R}\,\sin\beta$

$\frac{L - R}{L + R} = \tan\theta$

where $L$ and $R$ are the gains of the left and right channels
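A minimal sketch of the pan-potting relation, assuming it is the stereophonic law of sines, $\sin\alpha = \frac{L - R}{L + R}\sin\beta$, with $L$ and $R$ the left and right channel gains. The symbols $L$ and $R$, the function name and the 30° loudspeaker angle (equilateral-triangle setup) are assumptions of this example.

```python
import math

def perceived_angle_deg(gain_left, gain_right, beta_deg):
    """Perceived angle alpha of the phantom source, from the law of sines.

    Positive angles point towards the left loudspeaker in this sketch.
    """
    ratio = (gain_left - gain_right) / (gain_left + gain_right)
    return math.degrees(math.asin(ratio * math.sin(math.radians(beta_deg))))

BETA = 30.0   # assumed: each loudspeaker 30 degrees off the median plane

for gl, gr in [(1.0, 1.0), (1.0, 0.5), (1.0, 0.0), (0.0, 1.0)]:
    print(f"L = {gl}, R = {gr} -> alpha = {perceived_angle_deg(gl, gr, BETA):6.2f} deg")
```

Equal gains place the phantom source on the median plane; sending the signal to one loudspeaker only places it at that loudspeaker.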



Binaural audio

Transposition of the conventional stereo channels to headphones

Differences
Only the right channel reaches the right ear, and vice versa
There are never time differences between the signals



Binaural audio

Binaural synthesis
Producing real 3D audio effects requires computing the HRTFs and modifying the spectrum accordingly, based on the differences measured for "localized" sound sources (using interpolation)

Head tracking
Aimed at virtual reality applications



Binaural Effects

The recording is then played back through headphones, so that each channel is presented independently, without mixing or crosstalk. Thus, each of the listener's eardrums is driven with a replica of the auditory signal it would have experienced at the recording location.

Zeno: "Nature has given man one tongue, but two ears, that we may hear twice as much as we speak"

Binaural effects
Binaural Lateralization: A.D. 37 (72, 73, 74)
An auditory illusion: A.D. 39 (80)
http://www.feilding.net/sfuad/musi3012-01/demos/audio/



Binaural Lateralization

The most important benefit we derive from binaural hearing is the sense of localization of the sound source.

Low frequency sounds are lateralized mainly on the basis of interaural time difference, whereas high frequency sounds are lateralized mainly on the basis of interaural intensity differences.

Phase difference. Tones of 500 Hz and then 2000 Hz are heard with alternating interaural phases of plus and minus 45 degrees. At 500 Hz, the image switches from side to side as the phase changes. At 2000 Hz, on the other hand, no such movement is perceived. (45 degrees corresponds to 250 µs at 500 Hz and 62 µs at 2000 Hz.)

D37.BinauralLateralization_PHASE



An auditory illusion

Tones of 400 and 800 Hz alternate in both ears in opposite phase; that is, when the left ear receives 400 Hz, the right ear receives 800 Hz. About 99% of listeners hear a single low-frequency tone in one ear and a high-frequency tone in the other ear. Quite remarkably, when the headphones are reversed, most listeners hear the high tone and the low tone in the same ears as before.

D39.An_Auditory_Illusion



Bibliography

Lombardo, "Audio e Multimedia"
Ch. 1: Acustica
Ch. 2
- Sez. 2.3.1: I parametri della percezione
- Sez. 2.4: Localizzazione delle sorgenti sonore

Interesting readings
McClellan, "Signal Processing First"
• Ch. 2: Sinusoids
Sethares, "Tuning, Timbre, Spectrum, Scale"
• Ch. 2: The Science of Sound (parts)



Tools

Audacity, http://audacity.sourceforge.net/





Audio samples

AES Auditory Demonstrations
http://www.feilding.net/sfuad/musi3012-01/demos/audio/ (unofficial link)




Source code

none
