[ieee 2003 ieee workshop on applications of signal processing to audio and acoustics - new paltz,...



2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 19-22, 2003, New Paltz, NY

SESSION D: POSTER SESSION I - ICASSP’03 PAPERS

Music processing, content analysis, perception, and coding

D.01: Sub-channel Below the Perceptual Threshold in Audio

Heping Ding, National Research Council, Ottawa, Ontario, Canada

This paper explores the concept and possible ways of making use of an audio channel's capacity below the perceptual threshold as a hidden sub-channel. Such an audio channel can be one with conventional telephony or with voice-over-IP. Since the sub-channel is hidden, not detectable by the human ear, a communications system equipped with this technology will be compatible with the existing system in terms of audio signal transmission. Two potential applications of the technology are concurrent services and the extension of audio bandwidth. The latter application provides the listener with improved audio quality. Recordings demonstrating this will be played at the presentation.

D.02: A Missing Feature Approach to Instrument Identification in Polyphonic Music

Jana Eggink and Guy J. Brown, University of Sheffield, Sheffield, UK

Gaussian mixture model (GMM) classifiers have been shown to give good instrument recognition performance for monophonic music played by a single instrument. However, many applications (such as automatic music transcription) require instrument identification from polyphonic, multi-instrumental recordings. We address this problem by incorporating ideas from missing feature theory into a GMM classifier. Specifically, frequency regions that are dominated by energy from an interfering tone are marked as unreliable and excluded from the classification process. This approach has been evaluated on random two-tone chords and an excerpt from a commercially available compact disc, with promising results.
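
The exclusion of unreliable frequency regions can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes diagonal-covariance GMMs, so that an unreliable feature dimension can be marginalised out simply by skipping its term in the log-likelihood. The two instrument models, the three-band feature vector, and the function names are hypothetical.

```python
import math

def gmm_log_likelihood(x, mask, weights, means, variances):
    """Log-likelihood of x under a diagonal-covariance GMM, using only the
    dimensions flagged as reliable in mask (the rest are marginalised out)."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, reliable, m, v in zip(x, mask, mu, var):
            if reliable:
                log_p += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))

def classify(x, mask, models):
    """Return the name of the model with the highest masked log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(x, mask, *models[name]))

# Hypothetical single-component models over three frequency bands:
# (weights, means, variances) per instrument.
models = {
    "flute": ([1.0], [[1.0, 1.0, 1.0]], [[0.1, 0.1, 0.1]]),
    "cello": ([1.0], [[3.0, 3.0, 8.0]], [[0.1, 0.1, 0.1]]),
}

# A flute-like observation whose third band is dominated by an interfering tone.
x = [1.1, 0.9, 9.0]
print(classify(x, [True, True, True], models))   # all bands trusted
print(classify(x, [True, True, False], models))  # corrupted band excluded
```

With all bands trusted, the corrupted third band pulls the decision towards the wrong model; masking it restores the decision supported by the reliable bands. How the reliability mask is estimated in practice is outside the scope of this sketch.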

D.03: On Psychoacoustic Noise Shaping for Audio Requantization

Dreten De Koning and Werner Verhelst, Vrije Universiteit Brussel, Brussels, Belgium

Signal requantization to reduce the word-length of an audio stream introduces distortions. Noise shaping can be applied in combination with a psychoacoustic model in order to make requantization distortions minimally audible. The psychoacoustically optimal noise shaping curve depends on the time-varying characteristics of the input signal. Therefore, the noise shaping filter coefficients are to be computed and updated on a regular basis. In this paper, we present a least squares theory for optimal noise shaping of audio signals. It provides shorter and more straightforward proofs of known properties, and in contrast with the standard theory, it does show how noise shaping filters that attain the theoretical optimum can be designed in practice.
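
As a rough illustration of noise shaping in requantization, the sketch below uses the common error-feedback structure, in which past quantization errors are filtered and subtracted from the input before rounding. It is not the least squares design from the paper: the filter coefficients here are fixed and hypothetical, whereas the abstract describes coefficients derived from a psychoacoustic model and updated over time. With coefficients c_k, the shaped error spectrum follows 1 - sum_k c_k z^-k.

```python
import math

def requantize(samples, step, coeffs):
    """Requantize samples to multiples of step with error-feedback noise shaping.

    coeffs are the (hypothetical, fixed) feedback filter taps applied to past
    quantization errors; an empty list gives plain rounding.
    """
    history = [0.0] * len(coeffs)  # past quantization errors, most recent first
    out = []
    for x in samples:
        # Subtract the filtered past errors before quantizing.
        v = x - sum(c * e for c, e in zip(coeffs, history))
        y = step * math.floor(v / step + 0.5)  # round to the nearest level
        e = y - v                              # error made at this sample
        history = [e] + history[:-1]
        out.append(y)
    return out

# A constant input between two quantization levels.
signal = [0.3] * 100
plain = requantize(signal, 1.0, [])      # plain rounding: every sample becomes 0
shaped = requantize(signal, 1.0, [1.0])  # first-order shaping, 1 - z^-1
print(sum(plain) / len(plain), sum(shaped) / len(shaped))
```

With the single tap 1.0 the accumulated error telescopes, so the running sum of the output never drifts more than half a step from the running sum of the input; the error is pushed towards high frequencies rather than removed.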

D.04: Structural Analysis of Musical Signals Via Pattern Matching

Wei Chai, MIT Media Laboratory, Cambridge, MA, USA

A musical piece typically has repetitive structures. Analysis of this structure can be used for music indexing, thumbnailing or segmentation. The research described here aims at automatically analyzing the repetitive structure of musical signals. First, we detect the repetition of each segment in a piece using dynamic programming. Second, we summarize this repetition information and infer the structure based on some heuristic rules. The performance of our approach is demonstrated visually using figures for qualitative evaluation, and by two structural similarity measures for quantitative evaluation. The experimental results using a corpus of Beatles songs show that automatic structural analysis of music is possible.

D.05: Application of Pitch Tracking to South Indian Classical Music

Arvindh Krishnaswamy, Stanford University, Stanford, CA, USA

We present results of applying pitch trackers to samples of South Indian classical (Carnatic) music. In particular, we investigate the various musical notes used and their intonation. We try different pitch tracking methods and observe their performance in Carnatic music analysis. Examining our data, we find only 12 distinct intervals per octave among the notes that are played with constant pitch. However, there are pitch inflexions used sometimes that are not mere ornamentations; they are essential to the correct rendition of certain notes. Though these inflexions can be viewed as different versions of a particular note, they are certainly not equivalent to constant-pitch intervals like Just Intonation intervals, semitones or quartertones.
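
For readers unfamiliar with how constant-pitch intervals are compared, the following sketch converts frequency ratios to cents (1200 cents per octave, 100 per equal-tempered semitone) and reports how far a Just Intonation ratio lies from the nearest semitone. It is only a worked example of the standard cents formula, not part of the paper's analysis; the function names are hypothetical.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (an octave, ratio 2, is 1200 cents)."""
    return 1200.0 * math.log2(ratio)

def deviation_from_equal_temperament(ratio):
    """Return (nearest semitone count, offset in cents) for a frequency ratio."""
    size = cents(ratio)
    nearest = round(size / 100.0)
    return nearest, size - 100.0 * nearest

# Two Just Intonation ratios compared with equal-tempered semitones.
for name, ratio in [("major third", 5 / 4), ("perfect fifth", 3 / 2)]:
    steps, offset = deviation_from_equal_temperament(ratio)
    print(f"{name}: {cents(ratio):.1f} cents, {offset:+.1f} from {steps} semitones")
```

The just major third comes out roughly 14 cents below four semitones and the just fifth about 2 cents above seven, which gives a sense of the scale on which intonation differences between tuning systems are measured.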

D.06: Phase-Based Note Onset Detection for Music Signals

Juan Pablo Bello and Mark Sandler, University of London, London, UK

Note onsets mark the beginning of attack transients, short areas of a note containing rapid changes of the signal spectral content. Detecting onsets is not trivial, especially when analysing complex mixtures. Applications for note onset detection systems include time stretching, audio coding and synthesis. An alternative to standard energy-based onset detection is proposed by using phase information. It is suggested that by observing the frame-by-frame distribution of differential angles, the precise moment when onsets occur can be detected with accuracy. Statistical measures are used to build the detection function. The system is tested and tuned on a database of complex recordings.
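
The idea of using phase information can be sketched as follows, under several assumptions that are not taken from the paper. For a steady sinusoid the phase in each frequency bin advances by a constant amount per frame, so the second difference of the phase is close to zero; around an onset this prediction fails. The sketch summarises the per-bin deviations by their mean absolute value, which is only one possible statistic, and ignores bins below a hypothetical relative magnitude threshold because their phase is unreliable. The frame size, hop, and test signal are illustrative.

```python
import cmath
import math

def princarg(p):
    """Wrap an angle to the interval [-pi, pi)."""
    return (p + math.pi) % (2 * math.pi) - math.pi

def frame_spectrum(frame):
    """Hann-windowed DFT of one frame, bins 0..N/2 (direct O(N^2) evaluation)."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * m / n))
                for m, s in enumerate(frame)]
    return [sum(s * cmath.exp(-2j * math.pi * k * m / n)
                for m, s in enumerate(windowed))
            for k in range(n // 2 + 1)]

def phase_deviation_function(signal, frame_size, hop, rel_threshold=0.01):
    """Mean absolute second difference of phase per frame (first two frames are 0)."""
    spectra = [frame_spectrum(signal[i:i + frame_size])
               for i in range(0, len(signal) - frame_size + 1, hop)]
    phases = [[cmath.phase(c) for c in spec] for spec in spectra]
    detection = [0.0, 0.0]
    for n in range(2, len(spectra)):
        mags = [abs(c) for c in spectra[n]]
        floor = rel_threshold * max(mags)  # assumed threshold, not from the paper
        devs = [abs(princarg(phases[n][k] - 2 * phases[n - 1][k] + phases[n - 2][k]))
                for k in range(len(mags)) if mags[k] > floor]
        detection.append(sum(devs) / len(devs) if devs else 0.0)
    return detection

# Illustrative signal: one steady tone that switches to another at sample 100.
signal = [math.cos(2 * math.pi * (4 if m < 100 else 7) * m / 32) for m in range(200)]
detection = phase_deviation_function(signal, frame_size=32, hop=8)
peak = max(range(len(detection)), key=detection.__getitem__)
print("peak frame:", peak, "starting at sample", peak * 8)
```

In the steady portions of this signal the detection values stay near zero, and they rise only in the frames whose phase history spans the change at sample 100. A practical detector would add peak picking and would need tuning on real recordings.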
