Presented by: Shailesh Deshpande (shailesh@vt)
Music Information Retrieval
-or-
how to search for (and maybe find) music and do away with incipits
Michael Fingerhut
Multimedia Library and Engineering Bureau
IRCAM – Centre Pompidou
IAML - IASA 2004 Congress, Oslo
IRCAM - Institut de Recherche et Coordination Acoustique/Musique
IAML - International Association of Music Libraries
IASA - International Association of Sound Archives
06/28/2009
Agenda
- Introduction
- Why MIR?
- Take 1: multi-disciplinary domain
- Take 2: schematic
- Take 3: typology
- Challenges
- IRCAM cataloging tool
Introduction
Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music
The paper presents three views of this domain, followed by its challenges
What is an incipit? The first few words or opening line of a book. In music: the first few notes of a composition.
Why MIR?
Storage => increased availability of musical content in digital form (locally): CDs, DVDs, iPods
Computing power => faster processing of large volumes of digitized content
Networks => increased availability of musical content in digital form (remotely): Pandora, Yahoo Music, iTunes
Technological advances + demand from consumers = attention of research and industry
Take 1: multi-disciplinary domain
- General: Computer Science, Data Processing, AI, Pattern Recognition, Library & Information Sciences
- Philosophy and Psychology: Sensory Perception, Emotions & Feelings, Mental Processes & Intelligence
- Social Sciences: Sociology & Anthropology, Culture & Institutions, Law, Commerce
- Natural Science & Mathematics
- General Technology: Electric, Electronic, Magnetic, Communications & Computer Engineering
- The Arts: Music, Aesthetics, Composition
Take 2: schematic representation of MIR
Take 3: a typology of MIR
- Preprocessing: OCR, digitization, compression; encoding, notation; feature extraction; segmentation; instrument recognition; voice recognition
- Indexing: identification; clustering; classification
- Extraction: melody, key, harmony, rhythm
- Structural analysis: polyphony; repetition; similarity; summarization
- Organization: databases, systems, networks; compression; synchronization; metadata
- Search
  - Objective criteria: metadata indices (name, title, period, genre, instrumentation); full-text (with or without semantic tags); query by example (audio excerpt, melody, contour, rhythm, tonality, harmony); similarity; acoustical characteristics
  - Subjective criteria: mood; taste
- Retrieve, deliver, use: browsing; playlists; using and reusing (annotate, combine, transform); rights management (recognition, watermarking)
- Usability: evaluation; user studies
Music terms used in MIR
Pitch – perceived fundamental frequency of a sound. It may differ from the actual frequency content because of harmonics.
Timbre – the quality of a musical note that distinguishes different types of sound production, such as voices or musical instruments (saxophone vs. trumpet – with same pitch and loudness)
Rhythm (aka beat) - the variation of the length and accentuation of a series of sounds
Tempo – the speed or pace of a musical piece. It usually affects the mood of a song.
Melody – a linear succession of musical tones which is perceived as a single entity (‘horizontal’ aspect of music)
Harmony – simultaneous use of different pitches (‘vertical’ aspect of music)
Monophony – musical texture consisting of melody without accompanying harmony
Polyphony – musical texture consisting of two or more independent melodic voices
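To make the definition of pitch above concrete, here is a minimal sketch of one common way to estimate a fundamental frequency: pick the lag at which the signal's autocorrelation peaks. This is an illustration only, not a method taken from the paper. The helper names (synth_tone, estimate_pitch), the 8000 Hz sample rate and the search range are assumptions chosen for the example. The test tone has a weak 200 Hz fundamental and a stronger 400 Hz harmonic, so the loudest component is not the pitch.

```python
import math

def synth_tone(partials, sample_rate, n_samples):
    """Build a test signal from (frequency_hz, amplitude) pairs (hypothetical helper)."""
    return [
        sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in partials)
        for i in range(n_samples)
    ]

def estimate_pitch(samples, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    Only lags corresponding to frequencies between fmin and fmax are searched.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n_terms = len(samples) - lag
        # Average product of the signal with a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag] for i in range(n_terms)) / n_terms
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Weak 200 Hz fundamental, strong 400 Hz harmonic (illustrative values).
tone = synth_tone([(200.0, 0.5), (400.0, 1.0)], sample_rate=8000, n_samples=800)
print(estimate_pitch(tone, 8000, fmin=150.0, fmax=1000.0))
```

The estimate should land near 200 Hz rather than 400 Hz, because the waveform only repeats once per period of the fundamental. Real pitch trackers add windowing, octave-error handling and voicing detection, none of which is attempted here.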
Common Methods
- Modeling: start from a theory, look for patterns
  - Look for melodies, harmonic progressions
  - Attempt to find elements in data that correspond to such entities
- Statistical methods: look for patterns, build a theory
  - Perform statistical analysis on data, find common patterns and group them in clusters
  - Attempt to interpret their occurrence in musical pieces
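The statistical route (find patterns first, interpret them afterwards) can be sketched with plain k-means clustering. This is a generic illustration, not the paper's own procedure. The two features per piece (tempo scaled to BPM/100 and a 0-1 "brightness" value), the data points and the starting centroids are all made up for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means. Returns (centroids, labels) after a fixed number of rounds."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

# Hypothetical (tempo / 100, brightness) features for six pieces.
pieces = [(0.60, 0.20), (0.70, 0.30), (0.65, 0.25),
          (1.40, 0.80), (1.50, 0.90), (1.45, 0.70)]
centers, labels = kmeans(pieces, centroids=[pieces[0], pieces[3]])
print(labels)
```

The clusters come out of the data alone. Deciding that one group means "slow and dark" and the other "fast and bright" is the interpretation step, and it still needs a human or a model.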
MIR Challenges
The integration of audiovisual, symbolic and textual data
Fingerprinting – a small, unique set of features extracted from a sound file that distinguishes it from any other sound file
Music summarization – how to select a representative excerpt that gives a good idea of the work (similar to thumbnails for image files)
Computing similarity – there is no unique way in which two pieces may be similar: melodic, rhythmic, timbre, genre and style similarities
Indexing a musical piece by melody – to allow a query-by-humming (QBH) interface
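One simple way to index melodies for a query-by-humming interface is the Parsons code, which keeps only the contour of a melody: U (up), D (down) or R (repeat) for each note after the first. The sketch below is a swapped-in illustration, not the indexing scheme of the paper. The function names are hypothetical, and the melodies are given as MIDI note numbers purely for convenience.

```python
def parsons_code(pitches):
    """Return the Parsons code of a pitch sequence: '*' then U/D/R per step."""
    code = ["*"]
    for prev, cur in zip(pitches, pitches[1:]):
        code.append("U" if cur > prev else "D" if cur < prev else "R")
    return "".join(code)

def build_index(melodies):
    """Map each title to its contour string (without the leading '*')."""
    return {title: parsons_code(p)[1:] for title, p in melodies.items()}

def search(index, hummed_pitches):
    """Return titles whose contour contains the contour of the hummed query."""
    query = parsons_code(hummed_pitches)[1:]
    return [title for title, contour in index.items() if query in contour]

# Opening notes as MIDI note numbers (illustrative transcriptions).
index = build_index({
    "Twinkle Twinkle Little Star": [60, 60, 67, 67, 69, 69, 67],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62],
})

# The same opening hummed a whole tone higher still matches, since only contour is kept.
print(search(index, [62, 62, 69, 69, 71, 71]))
```

Contour matching tolerates transposition and inexact intervals, which suits untrained singers, but it discards rhythm and interval size, so short queries can match many unrelated pieces.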
MIR Challenges (contd.)
Encoding of music – at acoustic, structural and semantic levels
Query-by-example – search for music by singing, humming, whistling or playing an audio excerpt
Watermarking – adding identification information to digital audio for DRM
Benchmarking - limited number of standardized test collections available for evaluation of MIR systems
IRCAM cataloging tool
A tool to catalog and extract audio CD contents for online distribution
- Automatic identification of CDs
  - Compute the CDDB disc ID of the CD
  - CDDB disc ID: a binary number reflecting the offsets (start times) and lengths of the tracks of the CD
- Metadata retrieval and correction
  - Query Internet CDDB for metadata
  - Allow correction
- Extraction and compression
- Transfer to a Web server
IRCAM tool interface
When a CD is inserted in the computer:
- The tool computes its CDDB
- Retrieves the metadata if available (freedb.org, cddb.com, allmusic.com)
- Allows the librarian to correct errors, structure the tracks into works and select names from authority lists.
- When done, it adds the metadata to the catalog, extracts the tracks, compresses them and sends them to the audio server.
Information sources
The International Society for Music Information Retrieval (http://www.ismir.net/)
University of Illinois’ Graduate School of Library and Information Science (http://www.music-ir.org/)
IRCAM (http://www.ircam.fr/)
http://articles.ircam.fr/textes/Fingerhut04b/
The Listen Game: UCSD Computer Audition Lab MIR music ranking game (Herd It on Facebook)
- Multi-player game where you listen to music with lots of other people (aka the Herd). You are asked to describe the music (genre, mood, singer etc.) and get points when the Herd agrees with you.
- Innovative way to harness the power of social networking and collect metadata for MIR