MusicMood: Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics. Sebastian Raschka, December 10, 2014

Upload: sebastian-raschka

Post on 12-Jul-2015


Page 1: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

MusicMood

Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Sebastian Raschka December 10, 2014

Page 2: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Music Mood Prediction

• We like to listen to music [1][2]

• Digital music libraries are growing

• Recommendation system for happy music (clinics, restaurants ...) & genre selection

[1]  Thomas Schäfer, Peter Sedlmeier, Christine Städtler, and David Huron. The psychological functions of music listening. Frontiers in Psychology, 4, 2013.

[2]  Daniel Västfjäll. Emotion induction through music: A review of the musical mood induction procedure. Musicae Scientiae, 5(1 suppl):173–211, 2002.

Page 3: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Predictive Modeling

• Supervised learning: Regression, Classification

• Unsupervised learning: Clustering

• Reinforcement learning

• Ranking

• Hidden Markov models

[Figures: DBSCAN on a toy dataset; Naive Bayes on Iris (after LDA)]

Page 4: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Supervised Learning In a Nutshell

Page 5: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Supervised Learning - A Quick Overview

[Flowchart: Supervised Learning. Stages shown: Raw Data Collection, Pre-processing, Sampling, Split (Training Dataset, Test Dataset), Training Learning Algorithms, Cross Validation, Post-Processing, Refinement, Final Classification/Regression Model; New Data, Pre-processing, Prediction. Side labels: Missing Data, Feature Extraction, Feature Selection, Normalization, Dimensionality Reduction, Prediction-error Metrics, Model Selection, Hyperparameter Optimization.]

Sebastian Raschka 2014. This work is licensed under a Creative Commons Attribution 4.0 International License.

Page 6: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

MusicMood - The Plan

Page 7: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

The Dataset

Million Song Dataset: http://labrosa.ee.columbia.edu/millionsong/

Page 8: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Lyrics available? Lyrics in English?

Sampling

200 songs for validation

1000 songs for training

http://lyrics.wikia.com/Lyrics_Wiki Python NLTK
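The sampling step on this slide can be sketched as a random draw of two disjoint subsets. This is a minimal illustration, not the author's script: the function name `split_songs`, the seed, and the placeholder song IDs are assumptions, and the candidate list stands in for songs that already passed the lyrics and language filters.

```python
import random

def split_songs(song_ids, n_train=1000, n_valid=200, seed=0):
    """Randomly draw disjoint training and validation subsets."""
    rng = random.Random(seed)
    # Sampling without replacement guarantees the two subsets do not overlap.
    sampled = rng.sample(song_ids, n_train + n_valid)
    return sampled[:n_train], sampled[n_train:]

# Hypothetical IDs standing in for songs with available English lyrics.
candidates = [f"song_{i}" for i in range(5000)]
train_ids, valid_ids = split_songs(candidates)
print(len(train_ids), len(valid_ids))
```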

Page 9: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Mood Labels

• Downloading mood labels from Last.fm

• Manual labeling based on lyrics and listening

sad if ...

• Dark topic (killing, war, complaints about politics, ...)

• Artist in sorrow (lost love, ...)

Page 10: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics
Page 11: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Word Clouds

happy:

sad:

Page 12: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

A Short Introduction to Naive Bayes Classification

Page 13: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes - Why?

• With small sample sizes, it can outperform more powerful alternatives [1]

• "Eager learner" (on-line learning vs. batch learning)

• Fast for classification and re-training

• Success in Spam Filtering [2]

• High accuracy for predicting positive and negative classes in a sentiment analysis of Twitter data [3]

[1]   Pedro Domingos and Michael Pazzani. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2-3):103–130, 1997.

[2]   Mehran Sahami, Susan Dumais, David Heckerman, and Eric Horvitz. A Bayesian approach to filtering junk e-mail. In Learning for Text Categorization: Papers from the 1998 Workshop, volume 62, pages 98–105, 1998.

[3]  Alec Go, Richa Bhayani, and Lei Huang. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, pages 1–12, 2009.

Page 14: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Bayes Classifiers: It’s All About Posterior Probabilities

objective function: maximize the posterior probability
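The formula from this slide is not in the transcript. In standard notation (assumed here, not taken from the slide), with x the feature vector of a song and ω_j one of the mood classes, Bayes' rule and the resulting objective read:

```latex
P(\omega_j \mid \mathbf{x})
  = \frac{p(\mathbf{x} \mid \omega_j)\, P(\omega_j)}{p(\mathbf{x})},
\qquad
\hat{\omega} = \arg\max_{j}\; P(\omega_j \mid \mathbf{x})
```

In words: posterior = (class-conditional probability × prior) / evidence, and the predicted class is the one with the largest posterior.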

Page 15: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

The Prior Probability

Maximum Likelihood Estimate (MLE)
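The maximum likelihood estimate of a class prior is the relative frequency of that class in the training set. A minimal sketch, assuming a plain list of labels; the five labels below are made up for illustration and are not the project's data.

```python
from collections import Counter

def estimate_priors(labels):
    """MLE of the class priors: count of each class divided by the sample count."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Hypothetical mood labels for five training songs.
priors = estimate_priors(["happy", "sad", "sad", "happy", "sad"])
print(priors)
```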

Page 16: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

The Effect of Priors on the Decision Boundary

Page 17: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Class-Conditional Probability
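The slide's formula is not in the transcript. Under the naive (conditional independence) assumption, the class-conditional probability of a d-dimensional feature vector factorizes into per-feature terms, each of which can be estimated by relative frequency. The notation below is an assumption: N_{x_i, ω_j} is how often feature x_i occurs in samples of class ω_j, and N_{ω_j} is the total feature count for that class.

```latex
P(\mathbf{x} \mid \omega_j) = \prod_{i=1}^{d} P(x_i \mid \omega_j),
\qquad
\hat{P}(x_i \mid \omega_j) = \frac{N_{x_i,\,\omega_j}}{N_{\omega_j}}
```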

Maximum Likelihood Estimate (MLE)

"chance of observing feature x_i given that it belongs to class ω_j"

Page 18: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Evidence

just a normalization factor, can be omitted in decision rule:
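The rule itself is missing from the transcript. A reconstruction in standard notation (the symbols are assumed, not taken from the slide): the evidence sums over all classes, so it is the same for every class and does not change which class wins.

```latex
\begin{aligned}
p(\mathbf{x}) &= \sum_{j} p(\mathbf{x} \mid \omega_j)\, P(\omega_j) \\
\hat{\omega} &= \arg\max_{j}\; P(\omega_j \mid \mathbf{x})
             = \arg\max_{j}\; p(\mathbf{x} \mid \omega_j)\, P(\omega_j)
\end{aligned}
```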

Page 19: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes Models: Gaussian Naive Bayes

for continuous variables
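For a continuous feature, Gaussian Naive Bayes models the class-conditional density as a normal distribution whose mean and variance are estimated per class. A minimal sketch of that density; the function and parameter names are illustrative assumptions.

```python
import math

def gaussian_likelihood(x, mean, var):
    """Normal density p(x | class) for one feature with class-specific mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# Density of a standard normal at its mean.
peak = gaussian_likelihood(0.0, mean=0.0, var=1.0)
print(peak)
```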

Page 20: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes Models: Multi-variate Bernoulli Naive Bayes

for binary features
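In the multi-variate Bernoulli model every vocabulary word contributes a factor: P(word | class) if the word occurs in the document and 1 - P(word | class) if it does not. A minimal sketch under that definition; the two-word vocabulary and its probabilities are invented for illustration.

```python
def bernoulli_likelihood(doc_words, vocabulary, p_word_given_class):
    """P(document | class) from binary word-presence features."""
    likelihood = 1.0
    for word in vocabulary:
        p = p_word_given_class[word]
        # Present words contribute p, absent words contribute (1 - p).
        likelihood *= p if word in doc_words else (1.0 - p)
    return likelihood

# Hypothetical word probabilities for one class.
p_happy = {"love": 0.8, "cry": 0.1}
result = bernoulli_likelihood({"love"}, ["love", "cry"], p_happy)
print(result)
```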

Page 21: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes Models: Multinomial Naive Bayes
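The multinomial model works with word counts rather than word presence. A common way to estimate P(word | class) is additive smoothing with a parameter alpha, which keeps unseen words from getting zero probability. The sketch below assumes that estimator; the slide's exact formula is not in the transcript, and the tiny corpus is made up.

```python
from collections import Counter

def multinomial_word_probs(class_docs, vocabulary, alpha=1.0):
    """Smoothed estimate of P(word | class) from tokenized documents of one class."""
    counts = Counter(token for doc in class_docs for token in doc)
    total = sum(counts[word] for word in vocabulary)
    denominator = total + alpha * len(vocabulary)
    return {word: (counts[word] + alpha) / denominator for word in vocabulary}

# Hypothetical tokenized lyrics for a single class.
docs = [["love", "love", "sun"], ["sun"]]
probs = multinomial_word_probs(docs, ["love", "sun", "cry"], alpha=1.0)
print(probs)
```

With alpha = 1 this is Laplace smoothing; smaller values stay closer to the raw relative frequencies.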

Page 22: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Naive Bayes and Text Classification

Page 23: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

The Bag of Words Model

Feature Vectors
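In the bag of words model, a vocabulary is built from all documents and each document becomes a vector of word counts over that vocabulary, ignoring word order. A minimal sketch with whitespace tokenization; the three example documents are invented and are not from the slides.

```python
from collections import Counter

def build_vocabulary(documents):
    """Collect the sorted set of unique whitespace-separated tokens."""
    return sorted({token for doc in documents for token in doc.split()})

def to_feature_vector(document, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(document.split())
    return [counts[word] for word in vocabulary]

# Hypothetical example documents.
documents = [
    "sun is shining",
    "the weather is sweet",
    "the sun is shining and the weather is sweet",
]
vocabulary = build_vocabulary(documents)
vectors = [to_feature_vector(doc, vocabulary) for doc in documents]
print(vocabulary)
print(vectors)
```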

Page 24: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Tokenization and N-grams

“a swimmer likes swimming thus he swims”
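Tokenization splits the text into units, and an n-gram is a run of n consecutive tokens. A minimal sketch on the example sentence, assuming simple whitespace tokenization (a real pipeline might also strip punctuation and lowercase the text):

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "a swimmer likes swimming thus he swims".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
print(unigrams)
print(bigrams)
```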

Page 25: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Stemming and Lemmatization

Porter Stemming

Lemmatization

Page 26: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Stop Word Removal
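Stop word removal drops very common words that carry little information about the class. A minimal sketch; the short stop list here is a hypothetical stand-in, not the list used in the project.

```python
# Hypothetical, deliberately tiny stop list for illustration only.
STOP_WORDS = {"a", "an", "and", "he", "is", "she", "the"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop list (case-insensitive)."""
    return [token for token in tokens if token.lower() not in stop_words]

filtered = remove_stop_words("a swimmer likes swimming thus he swims".split())
print(filtered)
```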

Page 27: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Term Frequency
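The term frequency counts how often a term occurs in a document; dividing by the document length makes documents of different lengths comparable. A minimal sketch of the length-normalized variant, which is assumed here because the slide's formula is not in the transcript.

```python
from collections import Counter

def term_frequencies(tokens):
    """Length-normalized term frequency for every term in one document."""
    counts = Counter(tokens)
    n_tokens = len(tokens)
    return {term: count / n_tokens for term, count in counts.items()}

# Hypothetical tokenized lyric line.
tf = term_frequencies(["la", "la", "love", "la"])
print(tf)
```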

Page 28: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Term Frequency - Inverse Document Frequency (Tf-idf)
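Tf-idf scales the term frequency by how rare the term is across the corpus, so words that appear in nearly every document get little weight. A minimal sketch using idf(t) = log(N / n_t), where N is the number of documents and n_t the number containing t. This is one of several common variants and may differ from the formula on the slide; the corpus is made up.

```python
import math

def inverse_document_frequency(term, documents):
    """log(N / n_t); returns 0.0 for terms that occur in no document."""
    n_containing = sum(1 for doc in documents if term in doc)
    if n_containing == 0:
        return 0.0
    return math.log(len(documents) / n_containing)

def tf_idf(term, document, documents):
    """Raw term count in the document times the inverse document frequency."""
    return document.count(term) * inverse_document_frequency(term, documents)

# Hypothetical tokenized corpus.
documents = [["love", "love", "sun"], ["sun", "rain"], ["rain"]]
score_love = tf_idf("love", documents[0], documents)
score_sun = tf_idf("sun", documents[0], documents)
print(score_love, score_sun)
```

Here "love" outweighs "sun" in the first document because it is both more frequent there and rarer in the corpus.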

Page 29: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Grid Search and 10-fold Cross Validation to Optimize F1

• TP = true positive (happy predicted as happy)

• FP = false positive (sad predicted as happy)

• FN = false negative (happy predicted as sad)
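The F1 score is the harmonic mean of precision, TP / (TP + FP), and recall, TP / (TP + FN), with "happy" as the positive class. A minimal sketch; the counts in the example are invented, and the function assumes the denominators are nonzero.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; assumes tp > 0."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, not results from the project.
score = f1_score(tp=8, fp=2, fn=2)
print(score)
```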

Page 30: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

K-Fold Cross Validation
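In k-fold cross validation the training data is split into k folds; each fold serves once as the held-out test fold while the model is trained on the remaining k - 1 folds, and the k scores are averaged. A minimal sketch of the index bookkeeping only, assuming the samples were shuffled beforehand; the function name is illustrative.

```python
def k_fold_indices(n_samples, k=10):
    """Return k (train_indices, test_indices) pairs covering every sample once as test."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

folds = k_fold_indices(1000, k=10)
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```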

Page 31: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-Fold Cross Validation After Grid Search

(final model)

Page 32: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC): Multinomial vs. Multi-variate Bernoulli Naive Bayes

Page 33: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC): Multinomial Naive Bayes & Hyperparameter Alpha

Page 34: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC): Multinomial Naive Bayes & Vocabulary Size

Page 35: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC): Multinomial Naive Bayes & Document Frequency Cut-off

Page 36: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

10-fold Cross Validation (mean ROC): Multinomial Naive Bayes & N-gram Sequence Length

Page 37: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Contingency Tables of the Final Model

[Contingency tables: training set, test set]

Page 38: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Live Demo

http://sebastianraschka.com/Webapps/musicmood.html

Page 39: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Future Plans

• Growing a list of mood labels (majority rule).

• Performance comparisons of different machine learning algorithms.

• Genre prediction and selection based on sound.

Page 40: MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

Thank you!