Post on 30-Jan-2016

Page 1: DTW for QBSH

DTW for QBSH

J.-S. Roger Jang (張智星)

http://mirlab.org/jang

MIR Lab, CSIE Dept.

National Taiwan University

Page 2: DTW for QBSH

Dynamic Time Warping (DTW)

Goal: Allow comparison with high tolerance to tempo variation

Characteristics:
- Robust to irregular tempo variations
- Handles key transposition by trial and error
- Computationally expensive
- Does not satisfy the triangle inequality
- Some indexing algorithms do exist

Page 3: DTW for QBSH

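Type-1 DTW

3-step formula for type-1 DTW (with anchored beginning):

1. $D(i, j)$: DTW distance between $t(1:i)$ and $r(1:j)$
2. Recurrent formula for $D(i, j)$:
   $$D(i, j) = |t(i) - r(j)| + \min \{ D(i-1, j-2),\ D(i-1, j-1),\ D(i-2, j-1) \}, \qquad D(1, 1) = |t(1) - r(1)|$$
3. Answer $= \min_j D(m, j)$, where $m$ is the length of $t$

- t: input pitch vector (8 sec)
- r: reference pitch vector
- Local paths: 27-45-63 degrees

[Figure: DTW table with axes i (index into t) and j (index into r), showing the cell D(i, j) at (t(i), r(j)) and its neighbors at t(i-1) and r(j-1).]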
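The three-step formula above can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function name `dtw_type1` is hypothetical, indices are 0-based, and cells that cannot be reached from the anchored start are assumed to hold infinity.

```python
import math

def dtw_type1(t, r):
    """Type-1 DTW (27-45-63 degree local paths), anchored beginning, free end."""
    m, n = len(t), len(r)
    D = [[math.inf] * n for _ in range(m)]
    D[0][0] = abs(t[0] - r[0])                 # D(1,1) = |t(1) - r(1)|
    for i in range(1, m):
        for j in range(1, n):
            best = D[i - 1][j - 1]             # 45-degree predecessor
            if j >= 2:
                best = min(best, D[i - 1][j - 2])
            if i >= 2:
                best = min(best, D[i - 2][j - 1])
            D[i][j] = abs(t[i] - r[j]) + best  # stays inf if unreachable
    return min(D[m - 1])                       # Answer = min over j of D(m, j)

print(dtw_type1([1, 2, 3], [1, 1, 2, 2, 3, 3]))  # 0: the query at half speed
```

The second sequence is the first one stretched by a factor of two, which the 63-degree path absorbs at zero cost.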

Page 4: DTW for QBSH

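Type-2 DTW

3-step formula for type-2 DTW (with anchored beginning):

1. $D(i, j)$: DTW distance between $t(1:i)$ and $r(1:j)$
2. Recurrent formula for $D(i, j)$:
   $$D(i, j) = |t(i) - r(j)| + \min \{ D(i-1, j),\ D(i-1, j-1),\ D(i, j-1) \}, \qquad D(1, 1) = |t(1) - r(1)|$$
3. Answer $= \min_j D(m, j)$, where $m$ is the length of $t$

- t: input pitch vector (8 sec)
- r: reference pitch vector
- Local paths: 0-45-90 degrees

[Figure: DTW table with axes i (index into t) and j (index into r), showing the cell D(i, j) at (t(i), r(j)) and its neighbors at t(i-1) and r(j-1).]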
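A corresponding sketch for the type-2 recurrence, under the same assumptions as before (hypothetical function name `dtw_type2`, 0-based indices, out-of-table predecessors treated as infinity):

```python
import math

def dtw_type2(t, r):
    """Type-2 DTW (0-45-90 degree local paths), anchored beginning, free end."""
    m, n = len(t), len(r)
    D = [[math.inf] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            cost = abs(t[i] - r[j])
            if i == 0 and j == 0:
                D[i][j] = cost                     # D(1,1) = |t(1) - r(1)|
                continue
            best = math.inf
            if i > 0:
                best = min(best, D[i - 1][j])      # predecessor D(i-1, j)
            if j > 0:
                best = min(best, D[i][j - 1])      # predecessor D(i, j-1)
            if i > 0 and j > 0:
                best = min(best, D[i - 1][j - 1])  # 45-degree predecessor
            D[i][j] = cost + best
    return min(D[m - 1])                           # Answer = min over j of D(m, j)

print(dtw_type2([1, 2, 2, 3], [1, 2, 3, 9]))  # 0: the repeated 2 maps to one frame
```

Unlike type 1, the 0- and 90-degree paths let one frame map to arbitrarily many frames of the other sequence.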

Page 5: DTW for QBSH

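Local Path Constraints

Type 1: 27-45-63 local paths

$$D(i, j) = |t(i) - r(j)| + \min \{ D(i-1, j-2),\ D(i-1, j-1),\ D(i-2, j-1) \}$$

[Figure: $D(i, j)$ reached from $D(i-1, j-2)$, $D(i-1, j-1)$, and $D(i-2, j-1)$.]

Type 2: 0-45-90 local paths

$$D(i, j) = |t(i) - r(j)| + \min \{ D(i-1, j),\ D(i-1, j-1),\ D(i, j-1) \}$$

[Figure: $D(i, j)$ reached from $D(i, j-1)$, $D(i-1, j-1)$, and $D(i-1, j)$.]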

Page 6: DTW for QBSH

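Path Penalty

Goal: To avoid paths that deviate from 45 degrees

Path penalty:
- Small or no penalty for the 45-degree path
- Large penalty for paths that deviate from 45 degrees

$$D(i, j) = |t(i) - r(j)| + \min \{ D(i-1, j-2),\ D(i-1, j-1),\ D(i-2, j-1) \}$$

[Figure: local paths into $D(i, j)$ from $D(i-1, j-2)$, $D(i-2, j-1)$, and $D(i-1, j-1)$, annotated with path penalties (one label reads 0).]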
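One way to realize such a penalty is to add a constant to the off-diagonal candidates inside the min of the type-1 recurrence. The sketch below assumes exactly that. The parameter `eta` and its default value are hypothetical and not taken from the slides, which do not state how the penalty enters the formula.

```python
import math

def dtw_type1_penalized(t, r, eta=1.0):
    """Type-1 DTW with an assumed additive penalty eta on the 27/63-degree paths."""
    m, n = len(t), len(r)
    D = [[math.inf] * n for _ in range(m)]
    D[0][0] = abs(t[0] - r[0])
    for i in range(1, m):
        for j in range(1, n):
            candidates = [D[i - 1][j - 1]]                # 45 degrees: no penalty
            if j >= 2:
                candidates.append(D[i - 1][j - 2] + eta)  # off-diagonal, penalized
            if i >= 2:
                candidates.append(D[i - 2][j - 1] + eta)  # off-diagonal, penalized
            D[i][j] = abs(t[i] - r[j]) + min(candidates)
    return min(D[m - 1])

print(dtw_type1_penalized([1, 2, 3], [2, 3, 4], eta=0))  # 1: an off-diagonal path wins
print(dtw_type1_penalized([1, 2, 3], [2, 3, 4], eta=5))  # 3: the diagonal path wins
```

With `eta=0` this reduces to plain type-1 DTW. A large `eta` pushes the best path back toward the 45-degree diagonal.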

Page 7: DTW for QBSH

DTW Paths of “Anchored Beginning”

Anchored beginning: the end position is free to move.

Assumption: The speed of a user’s acoustic input falls between 1/2 and 2 times that of the intended song.

DTW table size for an 8-sec query = 250 x 375
- 250 = 31.25*8
- 375 = 250*1.5

[Figure: DTW paths with anchored beginning, plotted on axes i and j.]
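The table-size arithmetic can be checked with a few lines of Python. The frame rate of 31.25 frames per second and the factor 1.5 come from the slide. The names `FRAME_RATE`, `anchored_beginning_table_size`, and `ref_ratio` are hypothetical.

```python
FRAME_RATE = 31.25  # pitch frames per second, as given on the slide

def anchored_beginning_table_size(query_sec, ref_ratio=1.5):
    rows = round(FRAME_RATE * query_sec)  # number of query frames
    cols = round(rows * ref_ratio)        # number of reference frames compared
    return rows, cols

print(anchored_beginning_table_size(8))  # (250, 375)
```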

Page 8: DTW for QBSH

DTW Paths of “Anchored Anywhere”

Anchored anywhere: both ends are free to move.

DTW table size for an 8-sec query against a 3-min song = 250 x 5625
- 250 = 31.25*8
- 5625 = 31.25*180

[Figure: DTW paths with both ends free, plotted on axes i and j.]
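A free start position is commonly obtained by initializing the whole first row with the local distance, so that a path may begin at any reference frame. The sketch below applies that idea to the type-1 recurrence. The initialization and the function name `dtw_anchored_anywhere` are assumptions, since the slides do not spell out how the start is freed.

```python
import math

def dtw_anchored_anywhere(t, r):
    """Type-1 DTW where the path may start and end at any position in r."""
    m, n = len(t), len(r)
    D = [[math.inf] * n for _ in range(m)]
    for j in range(n):
        D[0][j] = abs(t[0] - r[j])             # free start: any j may begin a path
    for i in range(1, m):
        for j in range(1, n):
            best = D[i - 1][j - 1]
            if j >= 2:
                best = min(best, D[i - 1][j - 2])
            if i >= 2:
                best = min(best, D[i - 2][j - 1])
            D[i][j] = abs(t[i] - r[j]) + best
    end = min(range(n), key=lambda j: D[m - 1][j])  # free end: best final column
    return D[m - 1][end], end

print(dtw_anchored_anywhere([5, 6, 7], [0, 0, 5, 6, 7, 0]))  # (0, 4)
```

The query is found in the middle of the reference, ending at 0-based index 4, with zero distance.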

Page 9: DTW for QBSH

[Figure: numeric example of a DTW table.]

Page 10: DTW for QBSH

[Figure: the same numeric example of a DTW table, with additional cell values filled in.]

Page 11: DTW for QBSH

Implementation Issues

To save memory:
- Use a 2-column table for type-1 DTW
- Use a 1-column table for type-2 DTW

To avoid too many if-then statements:
- Pad type-1 DTW with two-layer padding
- Pad type-2 DTW with one-layer padding

To find a suitable path:
- Minimize total distance
- Minimize average distance
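The memory and padding ideas can be combined for type-2 DTW, as in the sketch below. It keeps a single array that is updated in place and uses one padding cell of infinity in place of boundary checks. The function name and the exact layout (`col[0]` as padding) are assumptions for illustration. It minimizes total distance.

```python
import math

def dtw_type2_one_column(t, r):
    """Type-2 DTW using one in-place array with a single padding cell."""
    n = len(r)
    col = [math.inf] * (n + 1)        # col[0] is padding; col[k] holds D(i, k-1)
    for i, ti in enumerate(t):
        # diag plays the role of D(i-1, j-1); 0 only to seed D(1,1) at the anchor
        diag = 0.0 if i == 0 else math.inf
        for k in range(1, n + 1):
            above = col[k]            # D(i-1, j), not yet overwritten
            left = col[k - 1]         # D(i, j-1), already updated (or padding)
            col[k] = abs(ti - r[k - 1]) + min(above, left, diag)
            diag = above              # becomes D(i-1, j-1) for the next cell
    return min(col[1:])               # min over j of D(m, j)

print(dtw_type2_one_column([1, 2, 3], [2, 3, 4]))  # 1
```

Type-1 DTW looks two steps back along i, which is why it needs a 2-column table and two layers of padding.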

Page 12: DTW for QBSH

Other Variants

Local constraints

Flexible start/end positions

Page 13: DTW for QBSH

DTW Path of “Anchored Beginning”

Page 14: DTW for QBSH

DTW Path of “Anchored Anywhere”

Page 15: DTW for QBSH

Another Two Views of DTW Path of “Anchored Anywhere”

Page 16: DTW for QBSH

Key Transposition (1/2)

Goal: Allow users’ input in different keys

Method 1: Mean shift and heuristic modification
- 5 DTW computations when compared to each song

[Figure: pitch-shift axis around the mean with ticks from -4 to 4; marked candidates are t-2, t, t+2 and t’-1, (t’), t’+1.]
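One plausible reading of Method 1 is sketched below: shift the query to the reference mean, evaluate shifts of 0, -2, and +2 semitones, then refine by ±1 around the best of those, for 5 DTW computations in total. This search order is an interpretation of the figure labels, not a confirmed description of the author's method. The names `dtw` and `key_transposition_search` are hypothetical, and a compact type-2 DTW stands in as the distance.

```python
import math

def dtw(t, r):
    """Compact type-2 DTW, anchored beginning, free end (two rolling rows)."""
    n = len(r)
    prev = [math.inf] * n
    for i, ti in enumerate(t):
        cur = [math.inf] * n
        for j, rj in enumerate(r):
            if i == 0 and j == 0:
                best = 0.0
            elif j == 0:
                best = prev[0]
            else:
                best = min(prev[j], cur[j - 1], prev[j - 1])
            cur[j] = abs(ti - rj) + best
        prev = cur
    return min(prev)

def key_transposition_search(t, r):
    mean_shift = sum(r) / len(r) - sum(t) / len(t)  # align the two means
    scores = {}
    def evaluate(delta):
        scores[delta] = dtw([x + mean_shift + delta for x in t], r)
    for delta in (0, -2, 2):                        # coarse candidates
        evaluate(delta)
    best = min(scores, key=scores.get)
    for delta in (best - 1, best + 1):              # fine candidates around the best
        evaluate(delta)
    best = min(scores, key=scores.get)
    return scores[best], mean_shift + best, len(scores)

print(key_transposition_search([60, 62, 64], [63, 65, 67]))  # (0.0, 3.0, 5)
```

The result is the best distance, the total pitch shift applied, and the number of DTW computations used.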

Page 17: DTW for QBSH

Key Transposition (2/2)

Method 2: Fixed point iteration
- Step 1: DTW alignment
- Step 2: Stop if the mapping path is fixed
- Step 3: Shift to the same mean based on the alignment
- Step 4: Go back to step 2.

Characteristics:
- The DTW distance is monotonically non-increasing, which guarantees convergence
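The iteration can be sketched as follows, re-running the alignment on each pass. Several choices here are assumptions rather than the author's method: type-2 local paths, both ends anchored (for brevity), a mean shift computed over the aligned pairs, and an iteration cap `max_iter` as a safeguard. All function names are hypothetical.

```python
import math

def dtw_with_path(t, r):
    """Type-2 DTW with both ends anchored; returns (distance, mapping path)."""
    m, n = len(t), len(r)
    D = [[math.inf] * n for _ in range(m)]
    back = [[None] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            cost = abs(t[i] - r[j])
            if i == 0 and j == 0:
                D[i][j] = cost
                continue
            cands = []
            if i > 0:
                cands.append((D[i - 1][j], (i - 1, j)))
            if j > 0:
                cands.append((D[i][j - 1], (i, j - 1)))
            if i > 0 and j > 0:
                cands.append((D[i - 1][j - 1], (i - 1, j - 1)))
            best, back[i][j] = min(cands)
            D[i][j] = cost + best
    path, cell = [], (m - 1, n - 1)
    while cell is not None:                  # backtrack to the anchored start
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    return D[m - 1][n - 1], path[::-1]

def transpose_by_fixed_point(t, r, max_iter=20):
    shifted, prev_path = list(t), None
    for _ in range(max_iter):
        _, path = dtw_with_path(shifted, r)  # step 1: DTW alignment
        if path == prev_path:                # step 2: stop if the path is fixed
            break
        prev_path = path
        # step 3: shift so that aligned frames have the same mean
        offset = sum(r[j] - shifted[i] for i, j in path) / len(path)
        shifted = [x + offset for x in shifted]
    return dtw_with_path(shifted, r)[0], shifted

print(transpose_by_fixed_point([60, 62, 64], [65, 67, 69]))  # (0.0, [65.0, 67.0, 69.0])
```

In this example the first alignment is the diagonal, the mean shift is 5 semitones, and the second alignment reproduces the same path, so the loop stops.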

Page 18: DTW for QBSH

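Type-3 DTW: Frame to Note Alignment

DP-based method for filling the table:

Recurrent formula:
$$D(i, j) = |t(i) - r(j)| + \min \{ D(i-1, j),\ D(i-1, j-1) \}$$

Local constraint: $D(i, j)$ is reached from $D(i-1, j)$ or $D(i-1, j-1)$.

[Figure: a frame-level pitch vector aligned against notes; note pitches shown include 62, 64, 65, and 67.]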
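A minimal sketch of the type-3 recurrence follows: each frame either stays on the current note or advances to the next one. The anchored beginning and the free end (minimum over the last row) are assumed to carry over from the earlier DTW types. The function name `dtw_type3` is hypothetical.

```python
import math

def dtw_type3(frames, notes):
    """Frame-to-note DTW: t is a frame-level pitch vector, r is a note sequence."""
    n = len(notes)
    prev = [math.inf] * n
    prev[0] = abs(frames[0] - notes[0])      # assumed anchor: first frame on first note
    for x in frames[1:]:
        cur = [math.inf] * n
        for j in range(n):
            best = prev[j]                   # D(i-1, j): stay on the same note
            if j > 0:
                best = min(best, prev[j - 1])  # D(i-1, j-1): move to the next note
            cur[j] = abs(x - notes[j]) + best
        prev = cur
    return min(prev)                         # assumed free end over the notes

print(dtw_type3([67, 67, 64, 64, 64, 65], [67, 64, 65, 62]))  # 0
```

The table has one column per note rather than per reference frame, which is where the efficiency gain comes from. Note durations play no role in the recurrence.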

Page 19: DTW for QBSH

Type-3 DTW

Characteristics:
- Frame-based query input vs. note-based music database
- Note duration unused
- More efficient, less effective
- Heuristics for key transposition

[Figure: mapping path]

Page 20: DTW for QBSH

Type-3 DTW: Effects of Key Transposition

Rough key transposition

Fine key transposition

Please refer to the online tutorial page for playback.