lecture 2 - northwestern university

Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio

Topic 7

Convolution, Filters, Correlation,

Representation


Short time Fourier Transform

•Break signal into windows

•Calculate DFT of each window


The Spectrogram

•A series of short term DFTs

•Typically just displays the magnitudes of X from

0 Hz to Nyquist rate

spectrogram(y,1024,512,1024,fs,'yaxis');

spectrogram(y,256,128,256,fs,'yaxis');


Equal Temperament

• Octave is a relationship by power of 2.

• There are 12 half-steps in an octave

122n

reff f

number of half-steps

from the reference pitch

frequency of the

reference pitch

frequency of

desired pitch


Spiral Pitch representation


Chroma: Many to one

• Chroma = log2(freq) – floor(log2(freq))

• Chroma periodic in range 0 to (almost) 1

• Chroma map on to pitch classes

CHROMA

0 50 100 150 200 250 300 3500

100

200

300

400

500

600

700

800

900

time

freq

uen

cy

0.25

0.0

0.5

0.75

800 Hz

400 Hz

200 Hz

100 Hz


Making a Chromagram

• Decide how to quantize (bin) the chroma range.

– 12 pitch classes? 120 bins? Equal temperment?

• Make a spectrogram

• For each time-step in the spectrogram

– find the chroma for each frequency from 0 to N/2

– Sum the amplitude of all frequencies with the same

chroma bin

• (Some chromagrams also add in the energy from the odd

harmonics)

– Place that value in the chroma bin


Chromagram of Clarinet

100 200 300 400 500 600 700 800 900

2

4

6

8

10

12

C

C#

D#

E

F

F#

G

G#

A

A#

B

D


Chromagram of Clarinet


Convolution

• convolution is a mathematical operator which takes

two functions f and g and produces a third function

that represents the amount of overlap between f and

a reversed and translated version of g.

• Typically, one of the functions is taken to be a fixed

filter impulse response, and is known as a kernel.

( )( ) ( ) ( )b

af g t f g t d

Convolution

operator


Discrete Convolution

• convolution is a mathematical operator which takes

two functions f and g and produces a third function

that represents the amount of overlap between f and

a reversed and translated version of g.

• Typically, one of the functions is taken to be a fixed

filter impulse response, and is known as a kernel.

( )[ ] [ ] [ ]n

f g m f n g m n


Convolution (Time Domain)

function C= convolution(A,B)

lengthA= length(A);

lengthB= length(B);

C = zeros(1, lengthA + lengthB - 1);

for m = 1:lengthA

for n = 1:lengthB

C(m+n-1) = C(m+n-1) + A(m)*B(n);

end

end

TIMES! Not Convolution


Convolution (Frequency Domain)

( ) ( )* ( )

( ) ( ) ( )

( ) ( ) ( )

y t x t h t

y t x h t d

Y X H

Convolution…

is defined in the

time domain as…

and in frequency

domain as…..


Speeding implementation

• How long does convolution take?

• Think about these factsFFT(x) takes O(nlog(n)) time

• Knowing this, how can we speed convolution?

( ) ( )* ( )

( ) ( ) ( )

y t x t h t

Y X H


Convolution = Filtering

(in the time domain)

Source Signalx(t)

Filterh(t)

Output Signaly(t)

)()(*)( tythtx Convolution


So what?

• So now you can design a quick-n-dirty bandpass filter H(k)!

• Steps

– Decide on the frequency resolution (number of points)

– Set the amplitudes of the different frequencies as you like

– Take the inverse FFT of your filter

– Do convolution with your signal

• Let’s try it!


Impulse response

• An “impulse” is this signal:

• The impulse response h(t) of a filter is what happens when you do convolution of the impulse signal with the filter.

• The impulse response h(t) of a filter is the same as the IFFT of the transfer function H() of that filter.

1 if 0( )

0 else

tt


Multiplication = Filtering

(in the frequency domain)

Source Signal

X()Filter

H()

Output SignalY()

( ) ( ) ( )X H Y

Multiplication Transfer function


Multiplication


• Multiply kth frequency of X by the kth

frequency of H

• This is COMPLEX multiplication

• Doing X.*H in Matlab takes care of this

( ) ( ) [ ] [ ]X H X k H k


Estimating H(k)


[ ] [ ] [ ]

so....

[ ] [ ] / [ ]

...more or less.

X k H k Y k

H k Y k X k


Noise and Distortion

Source Signal

X()Filter

H()

Output SignalY()

Distortion

D()

Noise

N()

+

Distortion and noise make it a lot harder to estimate H


Estimating H

• Hope noise and distortion are not

correlated with X.

• Call noise + distortion N.

• Then…

[ ] [ ] [ ] [ ]X k H k N k Y k


Estimating H

• Estimate H a lot of times, in the hopes

that the noise will wash out in the mix…

1

ˆ [ ] [ ] [ ] / [ ] [ ]

1[ ] [ ] [ ] [ ]

N

n n

n

H k Y k X k X k X k

where

Y k X k Y k X kN

The average value Over N estimates


Caveats

• This really only works on the frequencies

where there was energy in the input signal

X. If there wasn’t energy…well…then

we’re out of luck.

• So it is best to use broadband white noise

for X, if you can help it.

lecture 2 - northwestern university

Documents