
Post on 26-Dec-2015


Center for Auditory and Acoustic Research

Representation of Timbre in the Auditory System

Shihab A. Shamma

Center for Auditory and Acoustic Research, Institute for Systems Research

Electrical and Computer Engineering, University of Maryland, College Park

Musical Spectrograms

[Figure: spectrograms of Violin (vibrato) and Piano. Axes: time (ms) vs. frequency (ticks 125 to 2000).]

Anatomy of the Auditory System

[Figure: pathway from sound through the Early Auditory Stages, Midbrain Nuclei, Collicular Stages, and Central Auditory Stages. Structure labels: AVCN, PVCN, DCN, TB, LL, NLL, IC, MGB.]

Attributes of Complex Sounds

[Figure labels: Location, Timbre, Pitch; ILD, ITD, spectral cues; spatial maps; the auditory spectrum; harmonic templates; computing pitch.]

Early Auditory Processing Stages

Analysis: Cochlear filters
Transduction: Hair cells
Reduction: Lateral inhibition

[Figure: signal path from the eardrum through the cochlea (basilar membrane filters), the hair cell stages, and the lateral inhibitory network, drawn against a log f axis. Output panel: Auditory Spectrogram, time (ms) vs. frequency (ticks 125 to 2000).]
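The three stages named on this slide can be illustrated with a toy pipeline. This is a minimal sketch and not the model used in the talk: the gammatone-like filter shape, the bandwidth of 0.2 x CF, the 8 ms frames, and the neighbour-averaging form of lateral inhibition are all assumptions chosen for brevity, and `toy_auditory_spectrogram` is a hypothetical name.

```python
import numpy as np

def toy_auditory_spectrogram(signal, fs, cfs, frame_ms=8.0):
    """Toy sketch of analysis -> transduction -> reduction (all shapes assumed)."""
    t = np.arange(int(0.05 * fs)) / fs           # 50 ms impulse responses
    hop = int(fs * frame_ms / 1000)              # samples per output frame
    n_frames = len(signal) // hop
    rows = []
    for cf in cfs:
        # Analysis: gammatone-like band-pass filter, bandwidth assumed 0.2 * cf.
        h = t**3 * np.exp(-2 * np.pi * 0.2 * cf * t) * np.cos(2 * np.pi * cf * t)
        h /= np.sqrt(np.sum(h**2))
        y = np.convolve(signal, h)[: len(signal)]
        # Transduction: half-wave rectification, then short-time averaging.
        r = np.maximum(y, 0.0)[: n_frames * hop]
        rows.append(r.reshape(n_frames, hop).mean(axis=1))
    e = np.array(rows)                           # (n_channels, n_frames)
    # Reduction: each channel minus the mean of its two neighbours, rectified.
    padded = np.pad(e, ((1, 1), (0, 0)), mode="edge")
    inhibited = e - 0.5 * (padded[:-2] + padded[2:])
    return np.maximum(inhibited, 0.0)


fs = 8000
time = np.arange(int(0.2 * fs)) / fs
tone = np.sin(2 * np.pi * 1000 * time)           # 1 kHz test tone
cfs = [250, 500, 1000, 2000]
spec = toy_auditory_spectrogram(tone, fs, cfs)
print(spec.shape, cfs[int(spec.mean(axis=1).argmax())])
```

For a single tone, the inhibition step leaves activity concentrated in the channel tuned to the tone, which is the sharpening effect the "Reduction" stage is meant to convey.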

Auditory-Nerve Response Patterns to a Single Tone

[Figure: response pattern over time (ms, up to 60) across channels (ticks 250 to 4000), with a side panel showing the average response.]

Auditory-Nerve Response Patterns to Two-Tone Stimulus

[Figure: response pattern over time (ms, up to 60) across channels (ticks 250 to 4000), with a side panel showing the average response.]

[Figure: two panels for the utterance /r i t a w a y/. Axes: time (ms, up to 500) vs. frequency (ticks 250 to 4000).]

Lateral Inhibition

[Figure: from sound to an estimated stimulus spectrum. Stage labels: Cochlear Analysis (basilar membrane vibrations), Auditory-Nerve Responses (hair cells and auditory-nerve fibers along the tonotopic, characteristic frequency (CF) axis), and Lateral Inhibition. Stimulus label: harmonic series. Panels: A, B, C and A', B', C'. Axes: time (msec, ticks 60 and 500) vs. CF (ticks 250 to 4000 Hz).]

[Figure: spectrograms of one utterance under four conditions: Normal, Down-Shift, Compress, Dilate. Axes: time (ms, 100 to 1000) vs. frequency (ticks 125 to 2000).]


Awake Set-up

Awake ferret with head restraint in cylindrical holder

Spike Sorting

The raw neural trace typically contained multiple distinct waveforms (typically representing 1-4 neurons), which were sorted off-line.

[Figure: raw trace (time axis 0 to 120) and sorted-unit response panels labeled "10e Unit 1", "10e Unit 2", and "tagless 10e". Axes: time (ms, 50 to 150) vs. frequency (ticks 2000 to 16000).]

Waveforms were sorted in a semi-automatic procedure. First, a PCA-based algorithm was used to pre-sort the spikes. Then a MATLAB-based program was used to refine the classification.
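The PCA pre-sorting step can be sketched as follows. The slide does not say how the PCA scores were clustered, so the k-means step, the two-component projection, and the synthetic spike shapes below are assumptions, and `pca_presort` is a hypothetical name. It is written in Python rather than MATLAB purely for illustration.

```python
import numpy as np

def pca_presort(waveforms, n_components=2, n_units=2, n_iter=20):
    """Project spike waveforms onto principal components, then k-means cluster."""
    centered = waveforms - waveforms.mean(axis=0)
    # Rows of vt are the principal axes of the waveform set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    # Deterministic start: seed the centres with points spread along the first PC.
    order = np.argsort(scores[:, 0])
    picks = order[np.linspace(0, len(order) - 1, n_units).astype(int)]
    centres = scores[picks]
    for _ in range(n_iter):
        dists = np.linalg.norm(scores[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_units):
            if np.any(labels == k):
                centres[k] = scores[labels == k].mean(axis=0)
    return scores, labels


# Synthetic data (assumed shapes): two units with different spike waveforms.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 32)
unit_a = -np.exp(-((t - 0.3) / 0.05) ** 2)         # narrow, large spike
unit_b = -0.5 * np.exp(-((t - 0.4) / 0.12) ** 2)   # broader, smaller spike
truth = np.array([0] * 60 + [1] * 40)
spikes = np.where(truth[:, None] == 0, unit_a, unit_b)
spikes = spikes + 0.05 * rng.standard_normal((100, 32))

scores, labels = pca_presort(spikes)
# Cluster numbering is arbitrary, so score agreement up to a label swap.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(agreement)
```

In a semi-automatic workflow like the one described, labels from a step of this kind would only be a starting point for manual or programmatic refinement.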

Three envelopes of modulation:
- Slow (< 30 Hz)
- Intermediate (< 500 Hz)
- Fast (< 4 kHz)

[Figure: waveform and spectrogram of the utterance /come/ /home/ /right/ /away/. Time axes in ms; spectrogram frequency ticks 125 to 2000.]

Decomposing a Spectrogram into Dynamic Ripples

[Figure: a spectrogram (time in ms, 100 to 1000, vs. frequency ticks 125 to 2000) and its ripple components. Ripple panel: amplitude ∆A over frequency (kHz, 1 to 16) and time (ms, 0 to 250), with ω = 4 Hz. Decomposition panel axes: rate (Hz) and t (ms).]

Responses to Moving Ripples

[Figure: responses to moving ripples. Stimulus: ripple frequency Ω = 0.4 cyc/oct, temporal frequency ω = 4 to 32 Hz, 30 sweeps per ω, 55 dB. Axes: time (ms) vs. temporal frequency (Hz).]
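A moving ripple is a spectral envelope that is sinusoidal along the log-frequency axis and drifts in time. The sketch below uses the form S(t, x) = 1 + ∆A sin(2π(ωt + Ωx) + φ), with x in octaves. This form and its sign convention are assumptions, since the slides give the parameters ω, Ω, and ∆A but not the formula, and the grid sizes are arbitrary. `moving_ripple` is a hypothetical name.

```python
import numpy as np

def moving_ripple(rate_hz, scale_cyc_oct, depth=0.9, dur_s=1.0, n_oct=5.0,
                  frame_hz=1000, chan_per_oct=20, phase=0.0):
    """Envelope S(t, x) = 1 + depth * sin(2*pi*(rate*t + scale*x) + phase)."""
    t = np.arange(int(dur_s * frame_hz)) / frame_hz            # seconds
    x = np.arange(int(n_oct * chan_per_oct)) / chan_per_oct    # octaves
    tt, xx = np.meshgrid(t, x)                # rows: frequency, columns: time
    arg = 2 * np.pi * (rate_hz * tt + scale_cyc_oct * xx) + phase
    return t, x, 1.0 + depth * np.sin(arg)


# Parameters taken from the slide: 0.4 cyc/oct ripple drifting at 4 Hz.
t, x, s = moving_ripple(rate_hz=4.0, scale_cyc_oct=0.4)

# The envelope repeats every 1/4 s in time and every 1/0.4 = 2.5 octaves.
print(np.allclose(s[:, :750], s[:, 250:]), np.allclose(s[:50], s[50:]))
```

To turn this envelope into a sound, it would be imposed on a dense bank of log-spaced carrier tones. That step is omitted here.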

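[Figure: measuring the ripple transfer function. Panels A and B show responses over time (ms) as a function of ω (Hz) at Ω = 0.8 cyc/oct, and as a function of Ω (cyc/oct) at ω = 12 Hz; the transfer function magnitude |TF(ω, Ω)|; and the corresponding STRF(t, x) over t (ms) and frequency, related by a Fourier transform F{ }.]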
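The link between STRF(t, x) and the ripple transfer function TF(ω, Ω) can be sketched numerically as a two-dimensional Fourier transform pair. The toy STRF below (a damped 8 Hz oscillation in time multiplied by a 1 cyc/oct Gabor along frequency), the sampling steps, and the function name `transfer_function` are all assumptions for illustration, not measured data.

```python
import numpy as np

def transfer_function(strf, dt_s, dx_oct):
    """2-D Fourier transform of STRF(t, x), with its rate and scale axes."""
    tf = np.fft.fft2(strf)
    scales = np.fft.fftfreq(strf.shape[0], d=dx_oct)   # cyc/oct, along rows (x)
    rates = np.fft.fftfreq(strf.shape[1], d=dt_s)      # Hz, along columns (t)
    return tf, rates, scales


dt, dx = 0.004, 0.125                  # 4 ms steps, 1/8 octave steps (assumed)
t = np.arange(64) * dt
x = np.arange(40) * dx
tt, xx = np.meshgrid(t, x)             # rows: frequency, columns: time

# Toy STRF: temporal damped sinusoid times a spectral Gabor centred at 2.5 oct.
temporal = np.exp(-tt / 0.05) * np.sin(2 * np.pi * 8.0 * tt)
spectral = np.exp(-((xx - 2.5) / 0.5) ** 2) * np.cos(2 * np.pi * 1.0 * (xx - 2.5))
strf = temporal * spectral

tf, rates, scales = transfer_function(strf, dt, dx)
i, j = np.unravel_index(np.abs(tf).argmax(), tf.shape)
recovered = np.fft.ifft2(tf).real      # inverse transform returns the STRF
print(abs(rates[j]), abs(scales[i]), np.allclose(recovered, strf))
```

The peak of |TF| lands near the rate and scale built into the toy STRF, which is the sense in which ripple responses characterize a cell's spectro-temporal tuning.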

Examples of Different STRF Shapes

[Figure: six STRF panels, each with a vertical axis running from 0.125 to 4.]


Spectro-Temporal Response Fields

Multiscale Cortical Representation of a Spectrogram

[Figure: panels A and C. Axes: frequency (ticks 250 to 8000), rate (Hz, ticks 1 to 8), and a third axis with ticks 0.25 to 8.]


Scale-Rate Decomposition

Reconstruction
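As a rough illustration of decomposition and reconstruction, the sketch below treats the 2-D Fourier plane of a log-frequency spectrogram as a stand-in for the rate-scale representation. This is a simplification and an assumption: a cortical model would use a bank of spectro-temporal filters rather than a plain Fourier transform. The function names and the two-ripple test spectrogram are hypothetical.

```python
import numpy as np

def scale_rate(spectrogram, frame_hz, chan_per_oct):
    """Fourier-plane stand-in for a scale-rate decomposition."""
    coeffs = np.fft.fft2(spectrogram)
    scales = np.fft.fftfreq(spectrogram.shape[0], d=1.0 / chan_per_oct)  # cyc/oct
    rates = np.fft.fftfreq(spectrogram.shape[1], d=1.0 / frame_hz)       # Hz
    return coeffs, rates, scales

def reconstruct(coeffs, rates, scales, max_rate=np.inf, max_scale=np.inf):
    """Invert the decomposition, optionally keeping only slow or coarse ripples."""
    keep = (np.abs(scales)[:, None] <= max_scale) & (np.abs(rates)[None, :] <= max_rate)
    return np.fft.ifft2(coeffs * keep).real


# Test spectrogram: 5 octaves at 20 channels/oct, 1 s at 250 frames/s.
t = np.arange(250) / 250.0
x = np.arange(100) / 20.0
tt, xx = np.meshgrid(t, x)
slow = 0.5 * np.sin(2 * np.pi * (4 * tt + 0.4 * xx))     # 4 Hz, 0.4 cyc/oct
fast = 0.3 * np.sin(2 * np.pi * (16 * tt + 1.2 * xx))    # 16 Hz, 1.2 cyc/oct
spec = 1.0 + slow + fast

coeffs, rates, scales = scale_rate(spec, frame_hz=250, chan_per_oct=20)
full = reconstruct(coeffs, rates, scales)                 # everything kept
slow_only = reconstruct(coeffs, rates, scales, max_rate=8.0)
print(np.allclose(full, spec), np.allclose(slow_only, 1.0 + slow))
```

Dropping the fast rates before inverting gives a temporally smoothed spectrogram, the same kind of operation as manipulating temporal and spectral modulations separately.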


MUSICAL TIMBRE

Musical Spectrograms

[Figure: spectrograms of Violin (vibrato) and Piano. Axes: time (ms) vs. frequency (ticks 125 to 2000).]

Patterns of Musical Timbre

[Figure: rate-scale patterns for four instruments: Violin (vibrato), Piano, Oboe, Clarinet. Axes: rate (Hz, ticks -32 to 32, negative and positive halves marked - and +) vs. scale (ticks 0.25 to 8).]


Timbre Metric for Some Musical Instruments (TSVQ)

Timbre Metric for Musical Instruments

[Figure: two 12 x 12 instrument-by-instrument matrices, labeled exp1c-model-rs-mf and exp1c-subjects (Subjects 1-24). Instruments on both axes: Guitar, Harp, Violin Pizz., Violin Bowed, Bass, Synth A, Synth B, Oboe, Clarinet, Flute, Horn, Trumpet. Additional labels: Spectral cues, Temporal cues, Spectro-temporal cues.]

Mapping musical instruments

[Figure: spectrograms (time in ms vs. frequency in Hz) of Guitar, Trumpet, the "Trumpar", and an ACE Chord, with rate-scale panels (rate ticks -16 to 16, scale ticks 0.50 to 8.00).]

A Melody with the Trumpar

Speech Analysis & Assessment of Intelligibility

Three envelopes of modulation:
- Slow (< 30 Hz)
- Intermediate (< 500 Hz)
- Fast (< 4 kHz)

[Figure: waveform and spectrogram of the utterance /come/ /home/ /right/ /away/. Time axes in ms; spectrogram frequency ticks 125 to 2000.]


Human versus Ferret Sensitivity to Spectrotemporal Modulations

Auditory Scene Analysis & Pitch Extraction

Relevance to Auditory Scene Analysis: Streaming and grouping

[Figure: multiscale cortical representation, panels A and C. Axes: frequency (ticks 250 to 8000), rate (Hz, ticks 1 to 8), and a third axis with ticks 0.25 to 8.]

Working Hypotheses

Streaming: Any consistently isolated feature in the multiscale representation can be streamed, e.g.,
- spectral patterns (tones or average vocal tract spectra)
- repetitive temporal dynamics (modulated noise or sinusoidal FM tones)
- transients as segmenters

Grouping:
- Harmonicity and its linearly interpolated extensions (pitch extraction and segregation, regular patterns)
- Shared dynamics (common onsets and modulations)

Cortical Representation of Harmonic & Shifted Spectra

[Figure: panels Auditory Spectrum, Multiscale Representation, and Reduced Representation. Axes: time (ms, 100 to 1000) vs. frequency (Hz, 250 to 4000); frequency vs. scale (ticks 0.5 to 8.0).]

Shifted spectra are also grouped although they are inharmonic.


Computing Pitch

Pitch Estimates

[Figure: two panels, Pre-cortical processing and Post-cortical processing, each with frequency ticks 125 to 2000.]

Estimating Pitch; Extracting Pitch Streams

[Figure: panels A and B showing pitch tracks for Female, Male, and Male+Female speech, and the Extracted Female track. Axes: time (s) vs. frequency (ticks .125 to 2).]


Voice Morphing

Manipulating Temporal and Spectral Modulations

[Figure: four spectrograms of one utterance: Normal, Temporally smeared, Spectrally smeared, Temporally sharpened. Axes: time (ms, 100 to 1000) vs. frequency (ticks 125 to 2000).]

Morphing Voices

[Figure: three spectrograms: Female, Oboe, and Female Oboe. Axes: time (ms) vs. frequency (ticks 125 to 2000).]

Acknowledgment

Auditory Speech and Music Processing: Tai Chi, Mounya El-Hilali, Powen Ru

Cortical Physiology and Auditory Computations: Didier Depireux, Jonathan Fritz, David Klein, Jonathan Simon

Supported by:
- MURI # N00014-97-1-0501 from the Office of Naval Research
- # NIDCD T32 DC00046-01 from the NIDCD
- # NSFD CD8803012 from the National Science Foundation
