Mp25: Audio fingerprinting and metadata correction with Python
Audio fingerprinting and metadata correction with Python
Alastair Porter
November 21, 2011
Me
Background in Computer Science
Masters, McGill Music Tech
Online:
http://github.com/alastair (20/28 music; 11 in Python)
http://twitter.com/alastairporter
Python as a go-to language
Quick for prototyping
Use the same code in a production release
Very handy for API access (thin wrapper around urllib2)
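The "thin wrapper" idea can be sketched in a few lines. The slide names urllib2 (Python 2); this sketch uses its Python 3 successor, urllib.request. The class name, base URL, and parameter names are placeholders made up for illustration, not any real service's API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


class ApiClient:
    """Hypothetical thin wrapper: builds a query URL and decodes a JSON reply."""

    def __init__(self, base_url, api_key=None):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key

    def build_url(self, path, **params):
        # Add the key (if any) and sort parameters so the URL is reproducible.
        if self.api_key is not None:
            params["api_key"] = self.api_key
        query = urlencode(sorted(params.items()))
        url = f"{self.base_url}/{path.lstrip('/')}"
        return f"{url}?{query}" if query else url

    def get(self, path, **params):
        # Performs the HTTP request; assumes the service answers with JSON.
        with urlopen(self.build_url(path, **params), timeout=10) as response:
            return json.load(response)
```

Keeping URL construction separate from the network call means the wrapper can be checked without touching the network.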
Music and Metadata
Music and Metadata
The problem:
People are really bad at naming music
Inconsistent over releases
The solution:
Crowdsourcing
Get info from as many trusted sources as possible
Make renaming take no effort
MusicBrainz
Amazon
Amazon (Coverart)
Last.fm
Last.fm (Genre tags)
MusicBrainz
albumidentify
http://github.com/albumidentify/albumidentify
MP3, FLAC, Ogg, CDs
Identification strategy
If there’s a CD TOC, use that (MusicBrainz lookup)
If no match, use audio fingerprinting
If no match, do a text lookup (artist/album)
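The fallback order above can be sketched as a list of lookups tried in turn. The three lookup functions here are hypothetical stand-ins that read precomputed answers from a dictionary; the real tool would query MusicBrainz and a fingerprint service instead.

```python
def identify(album, strategies):
    """Try each (name, lookup) pair in order; return the first match found."""
    for name, lookup in strategies:
        match = lookup(album)
        if match is not None:
            return name, match
    return None, None


# Hypothetical stand-ins for the three lookups named on the slide.
def lookup_by_toc(album):
    return album.get("toc_match")


def lookup_by_fingerprint(album):
    return album.get("fingerprint_match")


def lookup_by_text(album):
    return album.get("text_match")


STRATEGIES = [
    ("toc", lookup_by_toc),
    ("fingerprint", lookup_by_fingerprint),
    ("text", lookup_by_text),
]

# No CD TOC available here, so the fingerprint lookup is the first to match.
ripped_files = {"fingerprint_match": "release-123"}
print(identify(ripped_files, STRATEGIES))  # ('fingerprint', 'release-123')
```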
Fingerprinting
Converts an audio signal to a short sequence of numbers
Smaller to compare than an entire file
Perceptual features rather than byte comparison (works with different encodings)
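To make the idea concrete, here is a deliberately simplified toy, not the algorithm used by any real fingerprinter (those work on spectral features). It assumes the audio is already decoded to a list of samples, and records one bit per pair of adjacent frames: whether the energy went up. Because it looks at the shape of the loudness curve rather than the bytes, a quieter or re-quantised copy yields a nearly identical bit sequence.

```python
import math


def toy_fingerprint(samples, frame_size=256):
    """Return one bit per pair of adjacent frames: 1 if energy rose, else 0."""
    energies = [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if b > a else 0 for a, b in zip(energies, energies[1:])]


def hamming_distance(fp_a, fp_b):
    """Count the positions where two equal-length fingerprints disagree."""
    return sum(x != y for x, y in zip(fp_a, fp_b))


# A test tone with a slowly varying loudness envelope.
original = [
    math.sin(2 * math.pi * 440 * n / 8000) * (1 + 0.5 * math.sin(n / 700))
    for n in range(8000)
]
# Simulate a different encoding: half the volume, coarsely quantised.
reencoded = [round(0.5 * s * 127) / 127 for s in original]

fp_original = toy_fingerprint(original)
fp_reencoded = toy_fingerprint(reencoded)
print(len(fp_original), "bits, differing in",
      hamming_distance(fp_original, fp_reencoded), "positions")
```

The 8000 samples shrink to a few dozen bits, which is the "smaller to compare" point; a byte-for-byte comparison of the two sample lists would report them as completely different.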
Identification strategy
Fingerprinting gives us a set of candidate tracks
A track could be on many albums (original release, best of, mix album)
Keep a list of what tracks we have for each album
Once we fill all the slots for an album, success!
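The slot-filling idea can be sketched as follows. This is an illustration under assumptions, not the albumidentify code: the data shapes, album identifiers, and function name are invented, and candidates are assumed to arrive as (album, track number) pairs.

```python
from collections import defaultdict


def find_complete_album(file_candidates, album_track_counts):
    """Return (album_id, {track_number: filename}) for the first album whose
    slots are all filled, or None if no album is complete.

    file_candidates: filename -> list of (album_id, track_number) candidates.
    album_track_counts: album_id -> number of tracks on that album.
    """
    slots = defaultdict(dict)
    for filename, candidates in file_candidates.items():
        for album_id, track_number in candidates:
            # Keep the first file seen for each slot.
            slots[album_id].setdefault(track_number, filename)
            if len(slots[album_id]) == album_track_counts[album_id]:
                return album_id, dict(slots[album_id])
    return None


# One track also appears on a best-of, but only the original release fills up.
candidates = {
    "a.mp3": [("best-of", 3), ("original", 1)],
    "b.mp3": [("original", 2)],
}
track_counts = {"best-of": 12, "original": 2}
print(find_complete_album(candidates, track_counts))
# ('original', {1: 'a.mp3', 2: 'b.mp3'})
```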
Metadata strategy
Text information from MusicBrainz
Genre from Last.fm
Image from Amazon (or folder.jpg)
MusicBrainz tells us where these are (don’t need to search)
Save in every file (text is cheap)
Writing it all out
Custom MP3/ID3 writer
Ogg meta tags
FLAC meta tags
Name files:
Artist/Artist - Year - Album/01 - Artist - Track
ReplayGain!
Be a good citizen: submit fingerprints to MusicBrainz
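The naming pattern above can be sketched as a small path builder. Assumptions in this sketch: "Track" in the pattern means the track title, a file extension is appended, and path separators inside names are replaced with underscores. The function names are made up for illustration.

```python
from pathlib import PurePosixPath


def safe(text):
    # Replace path separators so a name cannot create extra directories.
    return text.replace("/", "_").strip()


def target_path(artist, year, album, track_number, title, extension):
    """Build Artist/Artist - Year - Album/NN - Artist - Title.ext"""
    artist, album, title = safe(artist), safe(album), safe(title)
    return PurePosixPath(
        artist,
        f"{artist} - {year} - {album}",
        f"{track_number:02d} - {artist} - {title}.{extension}",
    )


print(target_path("Some Artist", 2011, "Some Album", 1, "First Song", "flac"))
# Some Artist/Some Artist - 2011 - Some Album/01 - Some Artist - First Song.flac
```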
What’s next
New version of MusicBrainz
New fingerprinter
More metadata
More metadata
Thanks
More information:
MusicBrainz: http://musicbrainz.org
albumidentify: http://github.com/albumidentify/albumidentify
More fingerprinting: http://acoustid.org, http://echoprint.me
Last.fm