Hypergraph models of playlist dialects

Post on 26-Jun-2015

DESCRIPTION

Playlist generation is an important task in music information retrieval. While previous work has treated a playlist collection as an undifferentiated whole, we propose to build playlist models which are tuned to specific categories or dialects of playlists. Toward this end, we develop a general class of flexible and scalable playlist models based upon hypergraph random walks. To evaluate the proposed models, we present a large corpus of categorically annotated, user-generated playlists. Experimental results indicate that category-specific models can provide substantial improvements in accuracy over global playlist models.

TRANSCRIPT

Hypergraph models of playlist dialects

Brian McFee
Center for Jazz Studies / LabROSA, Columbia University

Gert Lanckriet
Electrical & Computer Engineering, University of California, San Diego

LabROSA: Laboratory for the Recognition and Organization of Speech and Audio

Automatic playlist generation

Evaluating playlist algorithms

1. Observe playlists from users

2. Compute playlist likelihoods

3. Compare algorithms by likelihood scores

[M. & Lanckriet, 2011]

Key idea: playlist algorithm = probability distribution over song sequences

[M. & Lanckriet, 2011]

Modeling playlist diversity

[Figure: a playlist collection divided into categories, e.g., Road trip, Party mix, Mixed genre, Hip-hop]

Data collection

http://www.artofthemix.org/

Started in 1998, users upload and share playlists

[Ellis, Whitman, Berenzweig, and Lawrence, ISMIR 2002]

The data: AotM-2011

• 98K songs indexed to Million Song Dataset

• 87K playlists (1998-2011), ~210K contiguous segments

• 40 playlist categories, user meta-data available

[Figure: number of playlists per category, log-scale axis from 100 to 10^5. Categories shown: Mixed genre, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Depression, Punk, Break-up, Narrative, Hip-hop, Sleep, Dance-house, Electronic, Rhythm & blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Ambient, Blues]

• Majority of playlists are Mixed genre

• Remaining categories: contextual/mood, genre, other

Our goals

• Which categories can we model? Are some harder than others?

• Which features are useful for playlist generation?

• Do transitions matter? Are some categories less diverse?

A simple playlist model

1. Start with a set of songs

2. Select a subset (e.g., jazz songs)

3. Select a song

4. Find subsets containing the current song

5. Select a new subset

6. Select a new song

7. Repeat...
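The steps of the simple model can be sketched as a short sampler. This is only an illustration under stated assumptions: the `edges` dictionary, the toy song ids, and the function name `sample_playlist` are hypothetical, every choice is made uniformly at random, and a song is allowed to follow itself.

```python
import random

def sample_playlist(edges, length, rng):
    """Sample a playlist by hopping between named song subsets.

    edges: dict mapping a subset label to a set of song ids
           (an assumed format, not taken from the slides).
    """
    # Select a subset, then select a song from it.
    label = rng.choice(sorted(edges))
    song = rng.choice(sorted(edges[label]))
    playlist = [song]
    while len(playlist) < length:
        # Find the subsets that contain the current song.
        candidates = sorted(l for l, songs in edges.items() if song in songs)
        # Select a new subset, then a new song from that subset.
        label = rng.choice(candidates)
        song = rng.choice(sorted(edges[label]))
        playlist.append(song)
    return playlist

# Toy data: three overlapping subsets over five songs.
edges = {
    "jazz": {"a", "b", "c"},
    "1959": {"b", "d"},
    "mellow": {"c", "d", "e"},
}
playlist = sample_playlist(edges, 5, random.Random(0))
```

Because each new song is drawn from a subset that also contains the previous song, consecutive songs always share at least one subset label.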

Connecting the dots...

• Random walk on a hypergraph
  - Vertices = songs
  - Edges = subsets

• Learning: optimize edge weights from example playlists

• Sampling is efficient, edge labels provide transparency

The hypergraph random walk model

[Equation: model objective, with labelled terms: exponential prior, playlists, transitions, edge weights]
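To make the role of the edge weights concrete, here is a minimal sketch of how a weighted hypergraph random walk can assign a likelihood to a playlist. The parametrization is an assumption, not the formula from the slide: an edge containing the current song is chosen with probability proportional to its weight, and the next song is drawn uniformly from that edge. All names (`transition_prob`, `initial_prob`, `log_likelihood`) and the toy data are hypothetical.

```python
import math

def transition_prob(edges, weights, current, nxt):
    """P(nxt | current) under the assumed weighted walk."""
    containing = [e for e, songs in edges.items() if current in songs]
    total = sum(weights[e] for e in containing)
    if total == 0:
        return 0.0
    # Sum over every edge that could have produced this transition.
    return sum(
        (weights[e] / total) / len(edges[e])
        for e in containing
        if nxt in edges[e]
    )

def initial_prob(edges, weights, song):
    """P(first song): pick any edge by weight, then a song uniformly."""
    total = sum(weights.values())
    return sum(
        (weights[e] / total) / len(songs)
        for e, songs in edges.items()
        if song in songs
    )

def log_likelihood(edges, weights, playlist):
    """Log-probability of a whole playlist under the assumed walk."""
    probs = [initial_prob(edges, weights, playlist[0])]
    probs += [
        transition_prob(edges, weights, a, b)
        for a, b in zip(playlist, playlist[1:])
    ]
    if min(probs) == 0.0:
        return float("-inf")
    return sum(math.log(p) for p in probs)

# Toy data: one informative edge plus an edge covering every song.
edges = {"jazz": {"a", "b"}, "all": {"a", "b", "c", "d"}}
weights = {"jazz": 3.0, "all": 1.0}
p_ab = transition_prob(edges, weights, "a", "b")
```

In this toy setup, raising the weight of the "jazz" edge makes jazz-to-jazz transitions more likely, which is the kind of adjustment that learning from example playlists would make. The edge covering every song plays the role of a uniform shuffle.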

Edge construction: example

• Audio: cluster songs by timbre

• Multiple clusterings (k=16, 64, 256)

[Figure: example timbre clusters, panels Audio-1, Audio-2, Audio-3, Audio-4]

Edge construction: the kitchen sink

• Audio

• MSD taste profile

• Era

• Familiarity

• Lyrics

• Social tags

• Uniform shuffle

• Conjunctions: "TAG_jazz-&-YEAR_1959"

• 6390 edges, 98K vertices (songs)
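A conjunction edge such as "TAG_jazz-&-YEAR_1959" can be read as the intersection of two simpler edges. The sketch below builds such edges from two families of subsets. It is an illustration only: the function name, the `min_size` parameter, and the toy song ids are hypothetical, and the slides do not say how empty or tiny intersections were handled.

```python
from itertools import product

def conjunction_edges(first, second, min_size=1):
    """Intersect every edge in `first` with every edge in `second`.

    min_size is a hypothetical cutoff: intersections with fewer
    songs are dropped.
    """
    out = {}
    for (label_a, songs_a), (label_b, songs_b) in product(
        first.items(), second.items()
    ):
        both = songs_a & songs_b
        if len(both) >= min_size:
            out[f"{label_a}-&-{label_b}"] = both
    return out

# Toy data: tag edges and year edges over four songs.
tags = {"TAG_jazz": {"s1", "s2", "s3"}, "TAG_rock": {"s4"}}
years = {"YEAR_1959": {"s1", "s2"}, "YEAR_1990": {"s3", "s4"}}
conj = conjunction_edges(tags, years)
```

Here "TAG_rock-&-YEAR_1959" is dropped because no song carries both labels.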

Evaluation protocol

• Repeat x10:
  - Split playlist collection into 75% train/25% test
  - Learn edge weights on training playlists
  - Evaluate average likelihood of test playlists

• Compare gain in likelihood over uniform shuffle baseline
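The protocol can be sketched as a small evaluation loop. Several things here are assumptions rather than details from the slides: a smoothed song-frequency model (`fit_unigram`) stands in for the edge-weight learner, the gain is computed as the improvement in average log-likelihood relative to the magnitude of the uniform baseline, and the toy playlists and all function names are hypothetical.

```python
import math
import random
from collections import Counter

SONGS = ["a", "b", "c", "d"]  # toy catalogue (hypothetical)

def fit_unigram(train):
    """Stand-in for edge-weight learning: add-one smoothed frequencies."""
    counts = Counter(song for playlist in train for song in playlist)
    total = sum(counts.values()) + len(SONGS)
    return {song: (counts[song] + 1) / total for song in SONGS}

def avg_loglik(model, playlists):
    """Average log-likelihood per playlist under the stand-in model."""
    return sum(math.log(model[s]) for p in playlists for s in p) / len(playlists)

def uniform_loglik(playlists):
    """Average log-likelihood per playlist under a uniform shuffle."""
    n_songs = sum(len(p) for p in playlists)
    return n_songs * math.log(1 / len(SONGS)) / len(playlists)

def evaluate(playlists, repeats=10, train_frac=0.75, seed=0):
    """Mean relative log-likelihood gain over random train/test splits."""
    rng = random.Random(seed)
    gains = []
    for _ in range(repeats):
        shuffled = list(playlists)
        rng.shuffle(shuffled)
        cut = int(train_frac * len(shuffled))
        train, test = shuffled[:cut], shuffled[cut:]
        model = fit_unigram(train)
        base = uniform_loglik(test)
        gains.append((avg_loglik(model, test) - base) / abs(base))
    return sum(gains) / len(gains)

toy = [["a", "b", "a"], ["a", "a", "b"], ["b", "a", "c"], ["a", "d", "a"]] * 10
gain = evaluate(toy)
```

A positive gain means the fitted model assigns higher likelihood to held-out playlists than the uniform shuffle does.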

Experiment 1: global vs. categorical

• Fit one model per category

• Fit one global model to all categories

• Test on each category and compare likelihoods

• Question: When does categorical training improve accuracy?

[Figure: log-likelihood gain over uniform shuffle (axis 0% to 25%) per category, comparing the global model with category-specific models. Rows: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues]

• Largest gains for genre playlists

• No change for "hard" categories (e.g., Mixed, Alternating DJ, Theme)

Experiment 1: learned edge weights

[Figure: learned edge weights per category, grouped by feature type: Audio, CF, Era, Familiarity, Lyrics, Tags, Uniform. Rows: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic music, Dance-house, Rhythm and blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Reggae, Blues]

Experiment 2: continuity?

• Do we need to model playlist continuity?

• Simplified model:
  - ignore transitions
  - choose each edge IID

• Question: Are some categories more diverse than others?

[Equation: simplified model objective, with labelled terms: exponential prior, playlists, songs, edge weights]
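A minimal sketch of the simplified model follows, assuming that each song is generated by drawing an edge in proportion to its weight, independently of the previous song, and then drawing a song uniformly from that edge. The function names and toy data are hypothetical, and the exact form used in the talk is not recoverable from the slide.

```python
import math

def iid_song_prob(edges, weights, song):
    """P(song) when the edge is drawn IID by weight, ignoring transitions."""
    total = sum(weights.values())
    return sum(
        (weights[e] / total) / len(songs)
        for e, songs in edges.items()
        if song in songs
    )

def iid_log_likelihood(edges, weights, playlist):
    """Log-probability of a playlist; song order has no effect."""
    return sum(math.log(iid_song_prob(edges, weights, s)) for s in playlist)

# Toy data: every song belongs to the "all" edge, so no probability is zero.
edges = {"jazz": {"a", "b"}, "all": {"a", "b", "c", "d"}}
weights = {"jazz": 3.0, "all": 1.0}
ll_forward = iid_log_likelihood(edges, weights, ["a", "c", "b"])
ll_shuffled = iid_log_likelihood(edges, weights, ["b", "a", "c"])
```

The two orderings receive the same score, which is what ignoring transitions means here: the model can capture which songs a category favours, but not how they are sequenced.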

Experiment 2: continuity

[Figure: log-likelihood gain over uniform shuffle (axis -15% to 20%) per category, comparing the global model with category-specific models. Rows: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues]

• Most categories exhibit both continuity AND diversity

• Transitions are important!

Example playlists

Rhythm & Blues

EDGE                       SONG
70s & soul                 Lyn Collins - Think
Audio #14 & funk           Isaac Hayes - No Name Bar
DECADE 1965 & soul         Michael Jackson - My Girl

Electronic music

EDGE                       SONG
Audio #11 & downtempo      Everything but the Girl - Blame
DECADE 1990 & trip-hop     Massive Attack - Spying Glass
Audio #11 & electronica    Björk - Hunter

Conclusions

• Category-specific models outperform global playlist models.

• Continuity matters!

• Proposed model is simple, efficient, and transparent

• AotM-2011 dataset available now! http://cosmal.ucsd.edu/cal/projects/aotm2011

Thank you!