42nd international conference ilmenau, germany · 2016-12-22 · 42nd international conference...

6
42nd International Conference Ilmenau, Germany Ilmenau, Germany 752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October Conference report

Upload: others

Post on 08-Aug-2020

15 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 42nd International Conference Ilmenau, Germany · 2016-12-22 · 42nd International Conference Ilmenau, Germany 752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

4422nndd IInntteerrnnaattiioonnaall CCoonnffeerreenncceeIlmenau, GermanyIlmenau, Germany

752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

Conference report

Page 2: 42nd International Conference Ilmenau, Germany · 2016-12-22 · 42nd International Conference Ilmenau, Germany 752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

4422nndd IInntteerrnnaattiioonnaall CCoonnffeerreenncceeIIllmmeennaauu,, GGeerrmmaannyy

J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October 753

The rapid evolution of digital audio technologies, audio com-pression, and streaming has significantly changed the waydigital audio is created, processed, and consumed today. New

audio content can be produced at a very low cost. Thus the sheeramount of available audio data grows more and more each day. Con-tent-based management of audio recordings is essential to enableefficient use of these data. Aside from audio retrieval and recom-mendation techniques, the semantics of audio signals are also

Conference cochairs: Mark Sandler(above) and Karlheinz Brandenburg (right)

Papers cochairs: Christian Dittmar (left)and Anssi Klapuri (right)

Page 3: 42nd International Conference Ilmenau, Germany · 2016-12-22 · 42nd International Conference Ilmenau, Germany 752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

becoming more important in object-oriented audio coding, edit-ing, and processing. Recent product releases demonstrate this,but many more innovative functionalities are becoming avail-able.This was the motivation for the AES 42nd International

Conference Semantic Audio, held in Ilmenau, Germany, July 22–24. The conference was concerned with research anddevelopment targeted toward content-based handling of digitalaudio recordings. Its aim was to give a broad overview of thestate of the art, address many of the new scientific disciplinesinvolved in this still-emerging field, and continue to foster thisline of interdisciplinary research. From the outset, it was agreedthat the conference should feature a large number of invitedtalks. This brought together the luminaries in the field, andallowed all attendees to get a big-picture view of the state of theart in semantic audio and where it is heading.Although the conference was hosted by the Technical

University of Ilmenau, it was a joint effort of the University,Fraunhofer IDMT, and Queen Mary University of London(QMUL). It was cochaired by Karlheinz Brandenburg (IDMT) and

Mark Sandler (QMUL). Papers cochairs were Christian Dittmar(IDMT) and Anssi Klapuri (QMUL). The local organization teamwas led by Peggy Walther of IDMT.The technical program was full and varied. There were 21

technical talks and an additional 13 poster presentations. Theschedule was arranged into sessions each dealing with particularaspects of semantic audio, such as music information retrieval,speech processing and analysis, automatic music transcription,audio source separation, and intelligent audio effects. It coveredfeature extraction techniques, ontologies, and classification andrecommendation systems as well as evaluation methods. Introductory remarks on the first day were given by Karlheinz

Brandenberg. The technical sessions on Friday set the scene forthe conference, with overview talks on advances in music infor-mation retrieval, interactive music applications, and semanticspeech tagging. One inspiring talk was given by Masataka Goto,

CONFERENCE REPORT

Holger Grossmann, facilities chair (left), and Peggy Walther, secretary andwebmaster

Delegates enjoy getting to know each other over coffee.

754 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

Page 4: 42nd International Conference Ilmenau, Germany · 2016-12-22 · 42nd International Conference Ilmenau, Germany 752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

CONFERENCE REPORT

who demonstrated several technologies that are changing theway we listen to and understand music. His presentation alsoproduced universal laughter from the entire audience, when hepointed out that music is an environmental friendly, energy-saving resource. Unlike film, which is rarely watched by anyonemany times over, music is a highly renewable resource that canbe appreciated more with repeated listening. Of course, semanticaudio is not just related to music, and the second paper sessionmade this clear, with talks on semantic speech tagging, synchro-nization of speech and text, and automatic speech recognition.This session made clear just how challenging the tasks are andhow advanced the technologies have become. For instance, it ispossible to use machine learning in some cases to estimate theage, ethnic background, and even height of the speaker. Thethird paper session was dedicated to automatic music transcrip-tion, often called the holy grail or grand challenge in musicalsignal processing. A highlight in that session was KarinDressler’s talk on pitch estimation. As pointed out by sessionchair Anssi Klapuri, she has come first in community-based eval-uation of pitch-estimation techniques several years in a row, andher talk gave everyone an opportunity to find out how sheaccomplished this. The first day concluded with a three-coursedinner at the Hotel Lindenhof.On Saturday, the talks began with an explanation of adaptive

distance measures for exploring and structuring music collec-tions by Sebastian Stober, bringing together several years’ workin this field. This was followed by a thought-provoking talk byCynthia Liem on applying alignment techniques in order tocompare expressivity across musical performances. The morningalso saw a focused session on audio source separation, includinga method to extract singing voice from stereo recordings. Theafternoon began with the second poster session. Two interestingposters were on a psychoacoustic approach to wavefield synthesisand on the use of gaze tracking to observe uncertainty in music

tagging, both illus-trating the interdis-ciplinary aspects ofsemantic audio. Thefinal paper session ofthe day concernedinformed sourceseparation. Incontrast to blindsource separation,the informed app-roach exploits someadditional (semantic)knowledge for im-proved performance.This session includeda talk by JürgenHerre on parametriccoding of audio ob-jects, overviewingthe state of the art,commercial applica-tions, and opportuni-ties for the future.Saturday evening

began with a visit toFraunhofer IDMTlabs. The attendeeswere treated to toursof several spatialaudio labs and an anechoic chamber, a demonstration of high-performance flat-panel loudspeakers, and a preview of their newsemantic audio application, Songs2See, which features many ofthe technologies discussed during the conference. This was then

Karin Dressler discusses pitch estimationduring the session on automatic musictranscription, which is often called the holygrail of musical signal processing.

Pablo Cabañas Molero discusses theseparation of singing voices from stereorecordings.

J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October 755

Page 5: 42nd International Conference Ilmenau, Germany · 2016-12-22 · 42nd International Conference Ilmenau, Germany 752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

followed by a barbecue and concert, and for some the festivitiescontinued late into the night.The final day began with a talk on automated chord recogni-

tion, a topic that has seen much interest recently. The finalsession of the conference was on intelligent audio effects. JoshReiss gave an overview of how multichannel intelligent effectscan be used for live sound. This was followed by Christof Faller’sdiscussion of advances in microphone designs that enable betterdirectivity and improved surround recording. These two talksillustrated how semantics can be exploited not just for postpro-cessing and interaction tasks, but also for recording and liveperformance. The final talk of the session was given jointly byGyorgy Fazekas and Thomas Wilmering. It dealt with the designand use of semantic web ontologies for multitrack music produc-

tion. This exploratory work was a fitting endto the technical sessions. For those not trav-elling back home that day, there was adelightful excursion to Eisenach. Thisincluded lunch at the charming, western-themed Lasso restaurant, and a visit to theBachhaus, the first museum in the worlddedicated to Johann Sebastian Bach.

CONFERENCE REPORT

756 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

Invited speakers educate the conference delegates: top row, fromleft, Masataka Goto, Björn W. Schuller, Meinard Müller, and JürgenHerre; bottom row, Sebastian Stober, left, and Christof Faller.

Delegates crowdaround posters tohear authors suchas Sang Ha Park(below) presenttheir ideas aboutsemantic audio.

From left, Thippur Srivinas, LaurentGirin, and Estefanía Cano explain theprinciples of source separation.

Editor’s note: A printed proceedings (or a PDFof the entire book) of the conference papers canbe purchased online at www.aes.org/publications/conf.cfm. Individual papers are available in the AES E-Library atwww.aes.org/e-lib/.

Page 6: 42nd International Conference Ilmenau, Germany · 2016-12-22 · 42nd International Conference Ilmenau, Germany 752 J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October

CONFERENCE REPORT

J. Audio Eng. Soc., Vol. 59, No. 10, 2011 October 757

The band plays on during an enjoyable evening ofentertainment.

Social events during the conference gave everyone an opportunity to relaxand get acquainted. Clockwise from top left: delegates enjoy one of theexcellent buffet lunches; a trip to the Bach museum at Eisenach; VerenaHart of Fraunhofer IDMT is surprised with a birthday cake; and dinner atthe western-themed Lasso restaurant.