Realtime Recognition of
Orchestral Instruments
Ichiro Fujinaga
McGill University
Overview
Introduction
Lazy learning (exemplar-based learning)
• k-NN classifier
• Genetic algorithm
• Features
Results
Conclusions
Introduction
Realtime recognition of isolated monophonic orchestral instruments
Spectrum analysis by Miller Puckette’s fiddle~
Adaptive system based on an exemplar-based classifier and a genetic algorithm
Overall Architecture
[Figure: block diagram. Inputs: live mic input, sound file input. Blocks: Data Acquisition & Data Analysis (fiddle); Recognition (k-NN Classifier); Output (Instrument Name); Knowledge Base (Feature Vectors). Off-line: Genetic Algorithm with k-NN Classifier, producing the Best Weight Vector.]
Exemplar-based learning
• The exemplar-based learning model is based on the idea that objects are categorized by their similarity to one or more stored examples
• There is much evidence from psychological studies to support exemplar-based categorization by humans
• This model differs from both rule-based and prototype-based (neural nets) models of concept formation in that it assumes no abstraction or generalization of concepts
• This model can be implemented using a k-nearest-neighbor classifier and is further enhanced by application of a genetic algorithm
Exemplar-based categorization
Objects are categorized by their similarity to one or more stored examples
No abstraction or generalizations, unlike rule-based or prototype-based models of concept formation
Can be implemented using k-nearest neighbor classifier
Slow and large storage requirements?
K-nearest-neighbor classifier
Determine the class of a given sample by its feature vector:
• Distances between feature vectors of an unclassified sample and previously classified samples are calculated
• The class represented by the majority of k-nearest neighbors is then assigned to the unclassified sample
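The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the presented system: the function name `knn_classify`, the city-block distance, and the height and weight values are all assumptions made up for the example.

```python
from collections import Counter


def knn_classify(sample, exemplars, k=3):
    """Return the majority label among the k exemplars nearest to sample.

    exemplars is a list of (feature_vector, label) pairs.
    """
    def distance(x, y):
        # City-block distance (assumed here); another metric could be used.
        return sum(abs(a - b) for a, b in zip(x, y))

    # Step 1: rank the stored exemplars by distance to the new sample.
    nearest = sorted(exemplars, key=lambda ex: distance(sample, ex[0]))[:k]
    # Step 2: assign the class held by the majority of the k nearest.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up (height in cm, weight in kg) exemplars, for illustration only.
athletes = [
    ((180, 150), "sumo"), ((185, 160), "sumo"), ((178, 140), "sumo"),
    ((200, 100), "basketball"), ((205, 110), "basketball"),
    ((198, 95), "basketball"),
]

print(knn_classify((199, 98), athletes, k=3))
```

Nothing is learned ahead of time: all the work happens at query time against the stored exemplars, which is why the approach is called lazy learning.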
Example of k-NN classifier
Classification of athletes by height and weight (rikishi sumo wrestlers and NBA basketball players)
[Figure: scatter plot, shown in four builds. Horizontal axis: weight (kg), 75–200. Vertical axis: 170–210. Legend: Sumo, Chicago Bulls. The second build adds Michael Jordan to the legend and the third adds David Wesley. The fourth build replots the data with the axes rescaled: weight 0–700 kg, vertical axis 180–200.]
Distance measures
The distance in an N-dimensional feature space between two vectors X and Y can be defined as:
\[ d = \sum_{i=0}^{N-1} \left| x_i - y_i \right| \]
A weighted distance can be defined as:
\[ d = \sum_{i=0}^{N-1} w_i \left| x_i - y_i \right| \]
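As a small worked example of the weighted distance, the sketch below assumes the sum-of-absolute-differences form with one weight per feature. The function name `weighted_distance` and the numbers are illustrative only.

```python
def weighted_distance(x, y, w):
    """Sum of w[i] * |x[i] - y[i]| over all N dimensions."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    return sum(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))


# Made-up (height, weight) vectors for two athletes.
x = (198, 98)
y = (180, 100)

print(weighted_distance(x, y, (1.0, 1.0)))  # both features count: 20.0
print(weighted_distance(x, y, (0.0, 1.0)))  # height ignored: 2.0
```

Changing the weights changes which exemplars count as nearest, which is what makes the weight vector worth optimizing.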
Genetic algorithms
Optimization based on biological evolution
Maintenance of population using selection, crossover, and mutation
Chromosomes = weight vectors
Fitness function = recognition rate
Leave-one-out cross validation
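A minimal sketch of how these pieces might fit together: each chromosome is a weight vector, and its fitness is the leave-one-out recognition rate of a k-NN classifier that uses those weights. The population size, truncation selection, one-point crossover, mutation rate, and toy data are all assumptions chosen for illustration, not the settings of the presented system.

```python
import random
from collections import Counter


def weighted_distance(x, y, w):
    return sum(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))


def loo_recognition_rate(data, w, k=1):
    """Leave-one-out: classify each exemplar using all the others."""
    correct = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        nearest = sorted(rest, key=lambda ex: weighted_distance(x, ex[0], w))[:k]
        guess = Counter(lab for _, lab in nearest).most_common(1)[0][0]
        correct += guess == label
    return correct / len(data)


def evolve_weights(data, n_features, pop_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]

    def fitness(w):
        return loo_recognition_rate(data, w)

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]       # selection (truncation)
        children = [ranked[0]]                 # keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features) if n_features > 1 else 0
            child = a[:cut] + b[cut:]          # one-point crossover
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]           # mutation
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, fitness(best)


# Toy data: feature 0 separates the classes, feature 1 is misleading.
data = [((0.1, 0.9), "a"), ((0.2, 0.1), "a"), ((0.15, 0.5), "a"),
        ((0.8, 0.85), "b"), ((0.9, 0.15), "b"), ((0.85, 0.45), "b")]

best, rate = evolve_weights(data, n_features=2)
print(best, rate)
```

Because the fitness is computed entirely from the stored exemplars, this search can run off-line and hand a fixed weight vector to the recognizer.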
Features
Static features (per window)
• pitch
• mass or the integral of the curve (zeroth-order moment)
• centroid (first-order moment)
• variance (second-order central moment)
• skewness (third-order central moment)
• amplitudes of the harmonic partials
• number of strong harmonic partials
• spectral irregularity
• tristimulus
Dynamic features
• means and velocities of static features over time
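The moment-based features can be illustrated with a short sketch that treats one analysis window as a list of peak frequencies and amplitudes. The normalizations used here (amplitude-weighted moments, skewness divided by variance to the power 1.5) and the definition of velocity as average change per window are assumptions; the slides do not give the exact formulas.

```python
def spectral_moments(freqs, amps):
    """Mass, centroid, variance and skewness of one spectral window."""
    mass = sum(amps)                                   # zeroth-order moment
    if mass == 0:
        raise ValueError("spectrum has no energy")
    centroid = sum(f * a for f, a in zip(freqs, amps)) / mass
    variance = sum(a * (f - centroid) ** 2
                   for f, a in zip(freqs, amps)) / mass
    third = sum(a * (f - centroid) ** 3
                for f, a in zip(freqs, amps)) / mass
    skewness = third / variance ** 1.5 if variance > 0 else 0.0
    return {"mass": mass, "centroid": centroid,
            "variance": variance, "skewness": skewness}


def mean_and_velocity(values):
    """Mean of a static feature over windows and its change per window."""
    mean = sum(values) / len(values)
    if len(values) < 2:
        return mean, 0.0
    return mean, (values[-1] - values[0]) / (len(values) - 1)


# Made-up symmetric spectrum: three peaks centred on 200 Hz.
print(spectral_moments([100, 200, 300], [1, 2, 1]))
# Made-up centroid values from three successive windows.
print(mean_and_velocity([200.0, 210.0, 220.0]))
```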
Data
Original source: McGill Master Samples
Over 1300 notes from 39 different timbres (23 orchestral instruments)
Spectrum analysis by fiddle (2048 points)
First 46–232 ms of attack (1–9 windows)
Each analysis window (46 ms) consists of a list of amplitudes and frequencies of the peaks in the spectra
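The durations quoted above are consistent with a 2048-point window at a 44.1 kHz sample rate and a hop of half a window between successive windows. Neither the sample rate nor the overlap is stated in the slides, so both are assumptions in this arithmetic check.

```python
SAMPLE_RATE = 44_100   # Hz; assumed, not stated in the slides
WINDOW = 2048          # points per analysis window (from the slides)
HOP = WINDOW // 2      # assumed 50% overlap between windows


def span_ms(n_windows):
    """Duration covered by n overlapping analysis windows, in ms."""
    samples = WINDOW + (n_windows - 1) * HOP
    return 1000.0 * samples / SAMPLE_RATE


for n in (1, 9):
    print(n, "window(s):", round(span_ms(n)), "ms")
```

Under these assumptions one window covers about 46 ms and nine windows cover about 232 ms.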
Results
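Recognition rate
[Figure: bar chart of recognition rate (vertical axis 40–100) for Exp I, Exp II, and Exp III, with series for 3 instr, 7 instr, and 39 instr. Data labels as extracted: 81, 50, 64, 70, 100, 98, 88, 96.]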
Experiment I
• SHARC data
• static features
Experiment II
• fiddle
• dynamic features
Experiment III
• more features
• redefinition of attack point
Conclusions
Realtime timbre recognition system
Analysis by Puckette’s fiddle
Recognition using dynamic features
Adaptive recognizer by k-NN classifier enhanced with genetic algorithm
A successful implementation of an exemplar-based classifier in a time-critical environment
Future research
Performer identification
Speaker identification
Tone-quality analysis
Multi-instrument recognition
Expert recognition of timbre
Recognition rate for different lengths of analysis window
[Figure: recognition rate (vertical axis 20–100) against the number of analysis windows (1–9), with series for 3 instr, 7 instr, and 39 instr.]
Comparison with Human Performance