outline - university of southern california
Outline
• Introduction
• Music Information Retrieval
• Classification Process Steps
• Pitch Histograms
• Multiple Pitch Detection Algorithm
• Musical Genre Classification
• Implementation
• Future Work
Why do we classify?
• Increasing importance of digital music distribution
• Effectively navigating through large web-based music collections
• Structuring on-line music stores & radio stations
• Creating intelligent Internet music search engines and Peer-to-Peer systems
• Can be used in other types of analysis, such as similarity retrieval or summarization
Audio Classification
Genre labels shown in the figure: Jazz, Rock, Classical, Country, Electronica, Reggae, World, Folk, New Age. Pieces still to be classified are marked "?".
Audio Classification (cont.)
Music Information Retrieval (MIR)
The process of indexing and searching music collections.
• Symbolic MIR – Structured signals such as MIDI files are used.
– Melodic information is typically utilized.
– Two different approaches: Query-by-melody (manual) and Query-by-humming
• Audio MIR – Arbitrary unstructured audio signals are used.
– Timbral and rhythmic (beat) information is utilized.
What is MIDI?
• Musical Instrument Digital Interface
• A music definition language
• Communication protocol
• Supports 128 different voices
• Includes 16 channels
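As a small illustration of the protocol side, the sketch below splits a MIDI channel-voice status byte into its message type and channel. The function name is hypothetical; the byte layout (upper four bits for the message type, lower four bits for one of the 16 channels) follows the MIDI 1.0 convention.

```python
from typing import Tuple

NOTE_ON = 0x90  # status value of a Note On message on channel 0


def parse_channel_voice_status(status: int) -> Tuple[int, int]:
    """Split a channel-voice status byte into (message type, channel)."""
    if not 0x80 <= status <= 0xEF:
        raise ValueError("not a channel-voice status byte")
    message_type = status & 0xF0  # upper nibble, e.g. 0x90 = Note On
    channel = status & 0x0F       # lower nibble, 0..15 (16 channels)
    return message_type, channel


# 0x93 is a Note On message addressed to channel index 3.
message_type, channel = parse_channel_voice_status(0x93)
```

Data bytes that follow the status byte carry 7-bit values, which is why note numbers run from 0 to 127.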
Classification Process Steps
Flow chart elements:
• Inputs: MIDI file, Audio-from-MIDI file, Arbitrary Audio file
• Histogram Construction Algorithm (MIDI input)
• Multiple Pitch Detection Algorithm (audio input)
• Pitch Histogram
• 4D Feature Vector (Pitch Content Feature Set)
• Timbral & Rhythmic Features
• Labeled Feature Vectors used by Statistical Classifiers
• Genre Classification Result by comparing the feature vectors
Pitch Histograms
• Unfolded Histogram
– an array of 128 integer values (bins) indexed by MIDI note numbers
– showing the frequency of occurrence of each note in a musical piece
– contains information regarding the pitch range of the music
• Folded Histogram
– all notes are transposed into a single octave and mapped to a circle of fifths
– an array of 12 integer values
– contains information regarding the pitch content of the music
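The folding step can be sketched as follows. This is a minimal illustration, not the presenter's code: the function names are hypothetical, and the circle-of-fifths reordering c' = (7 × c) mod 12 is an assumption, since the slide only says the bins are "mapped to a circle of fifths".

```python
def fold_histogram(unfolded):
    """Collapse a 128-bin histogram into 12 pitch classes (0 = C)."""
    if len(unfolded) != 128:
        raise ValueError("expected 128 bins, one per MIDI note number")
    folded = [0] * 12
    for note, count in enumerate(unfolded):
        folded[note % 12] += count  # transpose every note into one octave
    return folded


def to_circle_of_fifths(folded):
    """Reorder bins so neighbours are a fifth apart (assumed mapping)."""
    reordered = [0] * 12
    for pitch_class, count in enumerate(folded):
        reordered[(7 * pitch_class) % 12] = count
    return reordered


# Toy input: C4 (60) three times, C5 (72) twice, G4 (67) four times.
unfolded = [0] * 128
unfolded[60], unfolded[72], unfolded[67] = 3, 2, 4

folded = fold_histogram(unfolded)      # both C notes land in bin 0, G in bin 7
circle = to_circle_of_fifths(folded)   # G moves to bin 1, next to C
```

With this ordering, harmonically related pitch classes such as C and G occupy adjacent bins rather than bins seven apart.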
Folded Pitch Histogram – Index Numbers
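MIDI note numbers, 12 per row; each column holds the notes that fold into the same bin.
                127 126 125 124 123 122 121 120
119 118 117 116 115 114 113 112 111 110 109 108
107 106 105 104 103 102 101 100  99  98  97  96
 95  94  93  92  91  90  89  88  87  86  85  84
 83  82  81  80  79  78  77  76  75  74  73  72
 71  70  69  68  67  66  65  64  63  62  61  60
 59  58  57  56  55  54  53  52  51  50  49  48
 47  46  45  44  43  42  41  40  39  38  37  36
 35  34  33  32  31  30  29  28  27  26  25  24
 23  22  21  20  19  18  17  16  15  14  13  12
 11  10   9   8   7   6   5   4   3   2   1   0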
Unfolded Pitch Histograms
Fig.1 - Unfolded Pitch Histograms of 2 Jazz pieces (left) and 2 Irish songs (right).
Pitch Histogram features
• Four-dimensional feature vector
– PITCH-Fold
– AMPL-Fold
– PITCH-Unfold
– DIST-Fold
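The slide names the four features without defining them. The sketch below assumes the following definitions, which should be checked against the original work: PITCH-Fold is the bin of the highest peak in the folded histogram, AMPL-Fold is the height of that peak, PITCH-Unfold is the bin of the highest peak in the unfolded histogram, and DIST-Fold is the distance in bins between the two highest folded peaks. The function name and toy data are hypothetical, and the folded histogram here is a plain modulo-12 fold of raw counts.

```python
def pitch_features(unfolded, folded):
    """Return [PITCH-Fold, AMPL-Fold, PITCH-Unfold, DIST-Fold] (assumed definitions)."""
    pitch_fold = max(range(12), key=lambda i: folded[i])
    ampl_fold = folded[pitch_fold]
    pitch_unfold = max(range(128), key=lambda i: unfolded[i])
    # Second-highest folded peak, excluding the main one.
    second_peak = max((i for i in range(12) if i != pitch_fold),
                      key=lambda i: folded[i])
    dist_fold = abs(pitch_fold - second_peak)
    return [pitch_fold, ampl_fold, pitch_unfold, dist_fold]


# Toy histograms: C4 (60) five times, G4 (67) three times, C5 (72) once.
unfolded = [0] * 128
unfolded[60], unfolded[67], unfolded[72] = 5, 3, 1
folded = [0] * 12
for note, count in enumerate(unfolded):
    folded[note % 12] += count

features = pitch_features(unfolded, folded)
```

If the histograms are normalized first, AMPL-Fold becomes a fraction rather than a raw count.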
Pitch Histogram Calculation
• For MIDI files:
– The algorithm increments the corresponding note’s frequency counter while using linear traversal over all MIDI events in the file.
– Normalization
• For arbitrary audio files:
– Multiple Pitch Detection Algorithm
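The MIDI case can be sketched as a single pass over the events. The event format used here, `(kind, note, velocity)` tuples, is a hypothetical stand-in for whatever a MIDI parser returns, and dividing by the total count is only one possible reading of "Normalization".

```python
def unfolded_histogram(events):
    """Count note onsets per MIDI note number in one linear pass."""
    counts = [0] * 128
    for kind, note, velocity in events:
        # A Note On with velocity 0 is conventionally a Note Off, so skip it.
        if kind == "note_on" and velocity > 0:
            counts[note] += 1
    return counts


def normalize(counts):
    """Scale counts so they sum to 1 (assumed normalization scheme)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Hypothetical event list for illustration only.
events = [
    ("note_on", 60, 64),
    ("note_off", 60, 0),
    ("note_on", 64, 70),
    ("note_on", 60, 0),   # velocity 0: treated as a note off
    ("note_on", 60, 90),
]
counts = unfolded_histogram(events)
normalized = normalize(counts)
```

Normalizing makes pieces of different lengths comparable, since the histogram then describes proportions rather than absolute note counts.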
Multiple Pitch Detection Algorithm
Fig.2 – Multiple Pitch Detection Flow Chart
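The flow chart itself is not reproduced in this transcript, so the detection stage is not sketched here. Whichever detector is used, each detected frequency still has to be mapped to a histogram bin. The sketch below shows that last step using the standard equal-tempered conversion n = 69 + 12 log2(f / 440); the function name and the list of detected frequencies are hypothetical.

```python
import math


def frequency_to_midi_note(freq_hz):
    """Map a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    note = round(69 + 12 * math.log2(freq_hz / 440.0))
    return min(127, max(0, note))  # clamp to the 128 histogram bins


# Hypothetical output of a pitch detector, in Hz.
detected = [440.0, 261.63, 880.0, 440.0]

histogram = [0] * 128
for freq in detected:
    histogram[frequency_to_midi_note(freq)] += 1
```

Once audio pitches are expressed as MIDI note numbers, the same unfolded and folded histograms can be built for audio as for MIDI files.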
Experiment Details
• Types of music content:
– symbolic (refers to MIDI)
– audio-from-MIDI (generated using a synthesizer playing a MIDI file)
– audio (digital audio files like MP3s found on the web)
• Five musical genres are used:
– Electronica, Classical, Jazz, Irish Folk and Rock
• Experiment Set:
– A set of 100 musical pieces in MIDI format for each genre
– A set of 100 audio-from-MIDI pieces for each genre
– A set of 100 general audio files
• KNN(3) Classifier
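A k-nearest-neighbour classifier with k = 3 can be sketched in a few lines. This is a generic illustration rather than the classifier code used in the experiments: the function name is hypothetical, Euclidean distance is an assumption, and the training vectors are invented toy values, not experimental data.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label a feature vector by majority vote among its k nearest neighbours."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented 4D feature vectors with genre labels, for illustration only.
train = [
    ([0, 0.40, 60, 7], "Jazz"),
    ([1, 0.35, 62, 5], "Jazz"),
    ([0, 0.60, 72, 7], "Classical"),
    ([2, 0.55, 74, 7], "Classical"),
    ([1, 0.58, 70, 2], "Classical"),
]

predicted = knn_classify(train, [0, 0.50, 61, 7])
```

In practice the features would probably need rescaling first, because a dimension with a large numeric range (here the unfolded pitch bin) otherwise dominates the distance.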
Classification Results in MIDI
Fig.3 – Classification accuracy comparison of random and MIDI
Classification Results in MIDI
Fig.4 – Pair-wise evaluation in MIDI
Classification Results in MIDI
Fig.5 – Average classification accuracy as a function of the length of input MIDI data
Classification Results in Audio-from-MIDI
Fig.6 - Classification accuracy comparison of random and Audio-from-MIDI
Comparison of Classification Results
Fig.7 – Classification accuracy comparison
Implementation
• MARSYAS
– MusicAl Research SYstem for Analysis and Synthesis
– the software used for audio Pitch Histogram calculation and musical genre classification.
– Three distinct modes of visualization:
– Standard Pitch Histogram plots
– 3D pitch-time surfaces
– Projection of the pitch-time surfaces onto a 2D bitmap
MARSYAS Visualization
Fig.8 – Examples of grayscale pitch-time surfaces. Jazz (top) and Irish Folk music (bottom)
Summary
• Symbolic representation is preferable for computing pitch information.
• This work can be viewed as an attempt to bridge the two distinct MIR approaches by using Pitch Histograms.
• Pitch Histograms do carry a certain amount of genre-identifying information.
• The Multiple Pitch Detection Algorithm is not perfect, but it works to a certain degree.
Future Work
• Real-time version of Pitch Histogram calculation
– for better classification performance.
– to conduct more detailed harmonic analysis such as figured bass extraction, tonality recognition, and chord detection.
• The features derived from Pitch Histograms might be applicable to the problem of content-based audio identification or audio fingerprinting.
• Alternative feature sets are needed.
• Query-based retrieval mechanism for audio music signals.
Thanks
• Cosku Turhan for the art work on my slides…
• 4 Non Blondes for their song, “What's Up?” :)