cs 591 s1 – computational audio · where precisely do notes start? 2. beat tracking ! given an...
TRANSCRIPT
3/27/15
1
Computer Science
CS 591 S1 – Computational Audio
Today: Analyzing Rhythm
Analyzing rhythm: basic notions and motivations
Onset detection, beat tracking
Rhythm analysis
Tempo Estimation
Time Warping to account for variations in tempo
Wayne Snyder Computer Science Department
Boston University
What is Rhythm?
3/27/15
2
Rhythm Analysis: Breakdown of Phases
1. Onset Detection § Where precisely do notes start?
2. Beat tracking § Given an audio recording of a piece of music, determine
the periodic sequence of beat positions
3. Tempo and Meter Estimation § Interpreting the periodicity in musical terms (BPM, meter)
4. Analyzing Style and Musical Effects § Variations in tempo (intentional and unintentional) § Musical effects: Anticipations, rubato, fermata, playing
“behind the beat” (or ahead), swing.
Note: There is something of a “chicken or egg” problem with 2 and 3.... We’ll fold these together for simplicity....
Time (seconds)
Example: Queen – Another One Bites The Dust
Rhythm Analysis: Introduction
3/27/15
3
Example: Queen – Another One Bites The Dust
Time (seconds)
Rhythm Analysis: Introduction
Further Examples:
If I Had You (Benny Goodman) Shakuhachi Flute Liszt: Sonetto No. 104 Del Petrarca
Where is the beat? Can you tap your foot to it? What is the meter? How to find the underlying regular beat which is being varied by the composer and/or performer for expressive effect?
Rhythm Analysis: Introduction
3/27/15
4
Rhythm Analysis: Introduction Even when rhythm is regular, there is a complicated semantic
problem: rhythm is hierarchical, consisting of many
interrelated groupings:
Pulse level: Measure
Rhythm Analysis: Introduction
Pulse level: Tactus (beat)
3/27/15
5
Rhythm Analysis: Introduction
Example: Happy Birthday to you
Pulse level: Tatum (fastest unit of division)
In a sophisticated piece of music, these various levels are exploited
by the composer in complicated ways. How should it be notated
and described precisely? What is the time signature? Example:
Bach, WTC, Fugue #1 in C Major
Rhythm Analysis: Introduction
3/27/15
6
Rhythm Analysis: Introduction
§ Hierarchical levels often unclear
§ Global/slow tempo changes (all musicians do this!)
§ Local/sudden tempo changes (e.g. rubato)
§ Vague information
(e.g., soft onsets, false positives )
§ Sparse information: not all beats occur!
(often only note onsets are used)
Challenges in beat tracking
§ Onset detection § Beat tracking § Tempo estimation
Tasks
Introduction
3/27/15
7
§ Onset detection § Beat tracking § Tempo estimation
Tasks
Tasks in Rhythm Analysis
period phase
§ Onset detection § Beat tracking § Tempo estimation
Tasks
Tasks in Rhythm Analysis
3/27/15
8
Tempo := 60 / period Beats per minute (BPM)
§ Onset detection § Beat tracking § Tempo estimation
Tasks
period
Tasks in Rhythm Analysis
Onset Detection
§ Finding start times of perceptually relevant acoustic events in music signal
§ Onset is the time position where a note is played
§ Onset typically goes along
with a change of the signal’s properties: – energy or loudness – pitch or harmony – timbre
3/27/15
9
Onset Detection
[Bello et al., IEEE-TASLP 2005]
§ Finding start times of perceptually relevant acoustic events in music signal
§ Onset is the time position where a note is played
§ Onset typically goes along
with a change of the signal’s properties: – energy or loudness – pitch or harmony – timbre
Steps
Time (seconds)
Waveform
Onset Detection (Amplitude or Energy-Based)
3/27/15
10
Time (seconds)
Squared waveform
Steps 1. Amplitude squaring (full-wave rectification of power signal)
Onset Detection (Amplitude or Energy-Based)
Onset Detection (Amplitude or Energy-Based)
Time (seconds)
Steps 1. Amplitude squaring (full-wave rectification of power signal) 2. Windowing (taking mean or max in each window): “energy
envelope”
3/27/15
11
Onset Detection (Energy-Based)
Time (seconds)
Steps 1. Amplitude squaring (full-wave rectification of power signal) 2. Windowing (taking mean or max in each window) : “energy
envelope” 3. Difference Function (using appropriate Distance Function):
captures changes in signal energy: “novelty curve.”
Onset Detection (Energy-Based)
Time (seconds)
Steps 1. Amplitude squaring (full-wave rectification of power signal) 2. Windowing (taking mean or max in each window) : “energy
envelope” 3. Difference Function (using appropriate Distance Function):
captures changes in signal energy: “novelty curve.” 4. Half-wave Rectification (negative samples => 0.0): note onsets are
indicated by increases in energy only.
3/27/15
12
Onset Detection (Energy-Based)
Time (seconds)
Steps 1. Amplitude squaring 2. Windowing 3. Differentiation 4. Half wave rectification 5. Peak picking
Peak positions indicate note onset candidates
Energy based methods work well for percussive instruments, including piano:
Example: Bach Well-Tempered Clavier, Book 1,
Fugue #1 in C major (Glenn Gould)
3/27/15
13
Onset Detection
§ Energy curves often only work for percussive music
§ Many instruments have weak note onsets: wind, strings, voice. – Example: Shakuhachi Flute
§ Biggest problem: pitch or timbre changes may not correlate with energy changes (e.g., a singer may change the pitch without changing loudness).
§ More refined methods needed that capture changes in spectrum
[Bello et al., IEEE-TASLP 2005]
1. Spectrogram Magnitude spectrogram
Freq
uenc
y (H
z)
Time (seconds)
|| X Steps: Onset Detection (Spectral-Based)
§ Aspects concerning pitch, harmony, or timbre are captured by spectrogram
§ Allows for detecting local energy changes in certain frequency ranges
3/27/15
14
Compressed spectrogram Y
|)|1log( XCY ⋅+=
Onset Detection (Spectral-Based)
1. Spectrogram 2. Logarithmic compression
Steps:
§ Accounts for the logarithmic sensation of sound intensity
§ Dynamic range compression § Enhancement of low-intensity
values § Often leading to enhancement
of high-frequency spectrum
Time (seconds)
Freq
uenc
y (H
z)
Spectral difference
Onset Detection (Spectral-Based)
1. Spectrogram 2. Logarithmic compression 3. Differentiation
Steps:
§ First-order temporal difference
§ Captures changes of the spectral content
§ Only positive intensity changes considered
Time (seconds)
Freq
uenc
y (H
z)
3/27/15
15
Spectral difference
t Novelty curve
Onset Detection (Spectral-Based)
1. Spectrogram 2. Logarithmic compression 3. Differentiation 4. Accumulation: spectral
differences summarized by a number.
Steps:
§ Frame-wise accumulation of all positive intensity changes
§ Encodes changes of the spectral content
Freq
uenc
y (H
z)
Onset Detection (Spectral-Based)
1. Spectrogram 2. Logarithmic compression 3. Differentiation 4. Accumulation
Steps:
Novelty curve
3/27/15
16
Subtraction of local average
Onset Detection (Spectral-Based)
1. Spectrogram 2. Logarithmic compression 3. Differentiation 4. Accumulation 5. Normalization
Steps:
Novelty curve
Onset Detection (Spectral-Based)
1. Spectrogram 2. Logarithmic compression 3. Differentiation 4. Accumulation 5. Normalization
Steps:
Normalized novelty curve
3/27/15
17
Onset Detection (Spectral-Based)
1. Spectrogram 2. Logarithmic compression 3. Differentiation 4. Accumulation 5. Normalization 6. Peak picking
Steps:
Normalized novelty curve
Examples of Onset Detection:
WTC Fugue #1 (Bach) A Smooth One (Benny Goodman) Doc‘s Guitar (Doc Watson) WTC Prelude #5 (Bach) Poulenc, Valse No.114 Faure, Op.15, No.1
3/27/15
18
Beat and Tempo
§ Steady pulse that drives music forward and provides the temporal framework of a piece of music
§ Sequence of perceived pulses that are equally spaced in time
§ The pulse a human taps along when listening to the music
[Parncutt 1994]
[Sethares 2007]
[Large/Palmer 2002]
[Lerdahl/ Jackendoff 1983]
[Fitch/ Rosenfeld 2007]
What is a beat?
The term tempo then refers to the speed of the pulse.
Beat and Tempo
§ Analyze the novelty curve with respect to reoccurring or quasi-periodic patterns
§ Avoid the explicit determination of note onsets (no peak picking)
Strategy
3/27/15
19
Beat and Tempo
Strategy
§ Comb-filter methods § Autocorrelation § Fourier transfrom
Methods
§ Analyze the novelty curve with respect to reoccurring or quasi-periodic patterns—as if it were a musical signal and you are trying to find the component pitches (= periodic patterns of the novelty curve)
§ Avoid the explicit determination of note onsets (no peak picking)
Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrograph of novelty curve!).
Tem
po (B
PM
)
Time (seconds)
Inte
nsity
Tempogram
3/27/15
20
Definition: A tempogram is a time-tempo represenation that encodes the local tempo of a music signal over time (= spectrograph of novelty curve!).
§ Compute a spectrogram (STFT) of the novelty curve § Convert frequency axis (given in Hertz) into
tempo axis (given in BPM) § Magnitude spectrogram indicates local tempo
Fourier-based method
Tempogram (Fourier)
Tem
po (B
PM
)
Time (seconds)
Tempogram (Fourier)
Novelty curve
3/27/15
21
Tem
po (B
PM
)
Tempogram (Fourier)
Novelty curve (local window)
Time (seconds)
Tem
po (B
PM
)
Hann-windowed sinusoidal
Tempogram (Fourier)
Time (seconds)
3/27/15
22
Tem
po (B
PM
)
Hann-windowed sinusoidal
Tempogram (Fourier)
Time (seconds)
Tem
po (B
PM
)
Tempogram (Fourier)
Hann-windowed sinusoidal
Time (seconds)
3/27/15
23
Definition: A tempogram is a time-tempo represenation that encodes the local tempo of a music signal over time (= spectrograph of novelty curve!).
§ Compare novelty curve with time-lagged local sections of itself
§ Convert lag-axis (given in seconds) into tempo axis (given in BPM)
§ Autocorrelogram indicates local tempo
Autocorrelation-based method (cf. pitch determination algorithm).
Tempogram (Autocorrelation)
Tempogram (Autocorrelation)
Novelty curve (local window)
Lag
(sec
onds
)
Time (seconds)
3/27/15
24
Tempogram (Autocorrelation)
Windowed autocorrelation
Lag
(sec
onds
)
Tempogram (Autocorrelation)
Lag = 0 (seconds)
Lag
(sec
onds
)
3/27/15
25
Tempogram (Autocorrelation)
Lag = 0.26 (seconds)
Lag
(sec
onds
)
Tempogram (Autocorrelation)
Lag = 0.52 (seconds)
Lag
(sec
onds
)
3/27/15
26
Tempogram (Autocorrelation)
Lag = 0.78 (seconds)
Lag
(sec
onds
)
Tempogram (Autocorrelation)
Lag = 1.56 (seconds)
Lag
(sec
onds
)
3/27/15
27
Tempogram (Autocorrelation)
Time (seconds)
Time (seconds)
Lag
(sec
onds
)
300
60
80
40
30
120
Tempogram (Autocorrelation)
Tem
po (B
PM
)
Time (seconds)
Time (seconds)
3/27/15
28
600
500
400
300
200
100
Tempogram (Autocorrelation)
Tem
po (B
PM
)
Time (seconds)
Time (seconds)
Time (seconds)
Tempogram Fourier Autocorrelation
Time (seconds)
Tem
po (B
PM
)
3/27/15
29
Tempogram Fourier Autocorrelation
210
70
Tem
po (B
PM
)
Tempo@Tatum = 210 BPM Tempo@Measure = 70 BPM Time (seconds) Time (seconds)
Tempogram
Fourier Autocorrelation
Time (seconds) Time (seconds)
Tem
po (B
PM
)
Time (seconds)
Emphasis of tempo harmonics (integer multiples)
Emphasis of tempo subharmonics (integer fractions)
[Grosche et al., ICASSP 2010] [Peeters, JASP 2007]
3/27/15
30
Tempogram (Summary)
Fourier Autocorrelation
Novelty curve is compared with sinusoidal kernels each representing a specific tempo
Novelty curve is compared with time-lagged local (windowed) sections of itself
Convert frequency (Hertz) into tempo (BPM)
Convert time-lag (seconds) into tempo (BPM)
Reveals novelty periodicities Reveals novelty self-similarities
Emphasizes harmonics Emphasizes subharmonics
Granularity increases as tempo increases; Suitable to analyze tempo on tatum and tactus level
Granularity increases as tempo decreases; Suitable to analyze tempo on tatum and measure level
Beat Tracking
§ Given the tempo, find the best sequence of beats
§ Complex Fourier tempogram contains magnitude and phase information
§ The magnitude encodes how well the novelty curve
resonates with a sinusoidal kernel of a specific tempo
§ The phase optimally aligns the sinusoidal kernel with the peaks of the novelty curve
[Peeters, JASP 2005]
3/27/15
31
Beat Tracking
Tem
po (B
PM
)
Inte
nsity
[Peeters, JASP 2005]
Beat Tracking
Tem
po (B
PM
)
Inte
nsity
[Peeters, JASP 2005]
3/27/15
32
Beat Tracking
Tem
po (B
PM
)
Inte
nsity
[Peeters, JASP 2005]
Beat Tracking
Tem
po (B
PM
)
Inte
nsity
3/27/15
33
Tem
po (B
PM
)
Inte
nsity
Time (seconds)
Beat Tracking
[Grosche/Müller, IEEE-TASLP 2011]
Beat Tracking
Novelty Curve
Predominant Local Pulse (PLP)
[Grosche/Müller, IEEE-TASLP 2011] Time (seconds)
3/27/15
34
§ Periodicity enhancement of novelty curve § Accumulation introduces error robustness § Locality of kernels handles tempo variations
§ Indicates note onset candidates § Extraction errors in particular for soft onsets § Simple peak-picking problematic
Beat Tracking
Predominant Local Pulse (PLP)
Novelty Curve
[Grosche/Müller, IEEE-TASLP 2011]
Beat Tracking
§ Local tempo at time : [60:240] BPM
§ Phase
§ Sinusoidal kernel
§ Periodicity curve
[Grosche/Müller, IEEE-TASLP 2011]
3/27/15
35
Beat Tracking
Tem
po (B
PM
)
Time (seconds)
Borodin – String Quartet No. 2
[Grosche/Müller, IEEE-TASLP 2011]
Beat Tracking
Tem
po (B
PM
)
Borodin – String Quartet No. 2
[Grosche/Müller, IEEE-TASLP 2011]
Strategy: Exploit additional knowledge (e.g. rough tempo range)
Time (seconds)
3/27/15
36
Beat Tracking
Brahms Hungarian Dance No. 5 Te
mpo
(BP
M)
Beat Tracking
Brahms Hungarian Dance No. 5
Time (seconds)
Tem
po (B
PM
)
3/27/15
37
Applications
§ Feature design (beat-synchronous features, adaptive windowing)
§ Digital DJ / audio editing (mixing and blending of audio material)
§ Music classification § Music recommendation § Performance analysis
(extraction of tempo curves)
Application: Feature Design
Fixed window size
[Ellis et al., ICASSP 2008] [Bello/Pickens, ISMIR 2005]
3/27/15
38
[Bello/Pickens, ISMIR 2005]
Application: Feature Design
Fixed window size Adaptive window size
[Ellis et al., ICASSP 2008]
Application: Feature Design
Fixed window size (100 ms)
Time (seconds)
3/27/15
39
Application: Feature Design
Adative window size (roughly 1200 ms) Note onset positions define boundaries
Time (seconds)
Application: Feature Design
Time (seconds)
Denoising by excluding boundary neighborhoods
Adative window size (roughly 1200 ms) Note onset positions define boundaries
3/27/15
40
Application: Audio Editing (Digital DJ)
http://www.mixxx.org/
Application: Beat-Synchronous Light Effects
3/27/15
41
From last time: Matching Different Tempos Naive Strategy: Usage of multiple scaling
Beethoven/Karajan
Time (seconds)
Matching Procedure 2. Strategy: Usage of multiple scaling
Beethoven/Karajan
Time (seconds)
3/27/15
42
Matching Procedure 2. Strategy: Usage of multiple scaling
Query resampling simulates tempo changes
Beethoven/Karajan
Time (seconds)
Matching Procedure 2. Strategy: Usage of multiple scaling
Minimize over all curves
Beethoven/Karajan
Query resampling simulates tempo changes
Time (seconds)
3/27/15
43
We have implicitly assumed that the tempo of each piece is different but constant: T1 = C*T2. The more general problem is: how to find an alignment between two sequences with different time sequences which may not have a linear relationship?
Alignment
Warping
3/27/15
44
Solution is to do a correlation between all measurement points in each piece and build a Correlation Matrix:
Correlation Matrix
Find the warping path for point-wise best correlations:
3/27/15
45
Warping Path
Warping Path
Not all paths are correct! Have to impose a monotonicity condition (slope is always positive)
3/27/15
46
Warping Path
Not all paths are correct! Have to impose a monotonicity condition (slope is always positive) Must account for all steps in both pieces.
Strategy: Multiscale Approach
Compute optimal warping path on coarse level
3/27/15
47
Strategy: Multiscale Approach
Project on fine level
Strategy: Multiscale Approach
Specify constraint region
3/27/15
48
Strategy: Multiscale Approach
Compute constrained optimal warping path
Rhythm Analysis: Breakdown of Phases
1. Onset Detection § Novelty curve (something is changing) § Indicates note onset candidates § Hard task for non-percussive instruments (strings)
2. Tempo Estimation § Fourier tempogram § Autocorrelation tempogram § Musical knowledge (tempo range, continuity)
3. Beat tracking § Find most likely beat positions § Exploiting phase information from Fourier tempogram