Embodied Music Meditation: A Real-time Interactive Audio-Visual System for Buddhist Mudras Exploration
RAU, MARK; ZHANG, YUN; ZHOU, YIJUN
CS 229 PROJECT, STANFORD UNIVERSITY
Motivation
Problem Definition
Models and Approaches
Experiment and Results
Model   1     2     3     4     5     6     7     mean
KNN     0.43  0.56  0.63  0.50  0.58  0.77  0.43  0.57
SVM     0.86  0.55  0.50  0.73  0.32  0.31  0.32  0.52
FANN    0.25  0.50  0.25  0.60  0.58  0.50  0.50  0.43
Table 2: comparison of different prediction models on each mudra (columns 1 to 7). Results are on the test set.
Analysis
Model Selection
Hands out of range
• K-nearest Neighbors Algorithm (KNN)
  o k = 20
• Support Vector Machine (SVM)
• Binary Decision Tree (BDT)
• K-means
  o Cluster outside trajectory
Overlapping gesture classification
• KNN
  o k = 8
• SVM
  o Reduced to 3 features to relieve overfitting.
• Fast Artificial Neural Network (FANN)
  o 8 inputs, 7 outputs, 3 layers and 64 hidden neurons.
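As a rough illustration of the KNN classifier listed above, the following minimal sketch does majority voting over the k = 8 nearest neighbors on 8-dimensional inputs with 7 class labels. The data is synthetic, the function name `knn_predict` is hypothetical, and the Euclidean distance metric is an assumption; the poster does not say which metric or implementation was used.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance (an assumption) from the query to every training example.
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Synthetic stand-in data: 7 classes (one per mudra), 8 features per example.
rng = random.Random(0)
centers = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(7)]
train_x, train_y = [], []
for label, center in enumerate(centers, start=1):
    for _ in range(40):
        train_x.append([c + rng.gauss(0, 0.05) for c in center])
        train_y.append(label)

# Classify a point at the centre of the third synthetic class.
print(knn_predict(train_x, train_y, centers[2], k=8))
```

A brute-force search like this scans every training example per query, which is usually acceptable for a few hundred examples.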
• Collaboration: With J. Cecilia Wu, a PhD Candidate at UCSB, on her ongoing project “Embodied Music Meditation”.
• Goal: Transform hand-performed Buddhist Mudras into a real-time audio-visual performance.
• Challenge: Two internal problems of the Leap Motion sensor used for hand tracking.
Problem 1: Hands out of range
• The Leap Motion does not record any data when a hand is out of range. We address this by predicting the trajectory outside the range in real time.
• Dataset: 292 examples, split into training set (80%) and test set (20%).
Problem 2: Overlapping gesture classification
• The Leap Motion tracks overlapping Buddhist Mudras inconsistently. We use the motion recorded before the hands overlap to classify gestures in real time.
• Dataset: 435 examples (~60 examples for each mudra), split into training set (70%) and test set (30%).
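The two splits described above could be produced along these lines. This is only a sketch: the helper `split_dataset`, the shuffling, the seed, and the rounding down of the training count are all assumptions, since the poster gives only the percentages.

```python
import random


def split_dataset(examples, train_percent, seed=0):
    """Shuffle a copy of examples and split it by an integer percentage."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point surprises; rounds down (assumption).
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]


# Example indices stand in for the recorded examples.
train_1, test_1 = split_dataset(range(292), 80)  # hands out of range
train_2, test_2 = split_dataset(range(435), 70)  # overlapping gestures
print(len(train_1), len(test_1), len(train_2), len(test_2))
```

With these assumptions the first dataset yields 233 training and 59 test examples, and the second yields 304 and 131.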
Figure 1: examples of input gestures and output labels (mudra 1 to mudra 7)
Feature Extraction
Hands out of range
• Features (285 = 19*15): hand position, hand velocity, palm normal, finger altitude, finger pan over 15 frames.
Overlapping gesture classification
• Features (8): average of palm normal and finger altitude over 10 frames (1/6 sec).
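The two feature layouts above can be sketched as follows. The first concatenates 19 per-frame values over 15 frames into a 285-element vector, and the second averages 8 per-frame values over 10 frames. That the 285 features come from plain concatenation is inferred from "285 = 19*15"; the helper names and the placeholder frame values are hypothetical and do not come from the Leap Motion API.

```python
FRAMES_OUT_OF_RANGE = 15  # window length for the out-of-range task
FRAMES_OVERLAP = 10       # window length for the overlap task (1/6 sec)


def flatten_window(frames):
    """Concatenate per-frame feature lists into one long vector."""
    return [value for frame in frames for value in frame]


def average_window(frames):
    """Average each feature column over the frames in the window."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


# Placeholder frames: 19 values per frame for task 1, 8 per frame for task 2.
window_a = [[float(f)] * 19 for f in range(FRAMES_OUT_OF_RANGE)]
window_b = [[float(f)] * 8 for f in range(FRAMES_OVERLAP)]

features_a = flatten_window(window_a)  # 19 * 15 values
features_b = average_window(window_b)  # 8 averaged values
print(len(features_a), len(features_b))
```

Averaging trades temporal detail for a much smaller input, which may be one reason the overlap task uses only 8 features.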
Figure 2: sensor system setup
Figure 3: selection of the number of clusters and the number of nearest neighbors
Hand out of range
Model   Train Accuracy   Test Accuracy   Predict Time
SVM     0.49 ± 0.09      0.21 ± 0.08     0.038s
BDT     0.80 ± 0.03      0.20 ± 0.03     0.0043s
KNN     0.36 ± 0.02      0.30 ± 0.04     0.0069s
Table 1: comparison of different prediction models
Overlapping gesture classification
Hand out of range
• The KNN model with k = 20 gave the best test accuracy, 30%. This is still low, but it is better than chance (10%).
• The response time of the KNN model is 6.9 ms, which is fast enough for real-time performance.
Overlapping gesture classification
• Confusion matrix
• The KNN model with k = 8 has the highest test accuracy, 57%, well above chance (14%).
• The response time of the KNN model is 3.9 ms, which is fast enough for real-time performance.
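The chance level and the real-time claims above can be checked with a little arithmetic. The sketch below assumes a 60 Hz frame rate, inferred from "10 frames (1/6 sec)" in the feature description, and treats "real-time" as fitting one prediction inside one frame period. Both are assumptions rather than criteria stated on the poster.

```python
FRAME_RATE_HZ = 60  # assumption: inferred from "10 frames (1/6 sec)"

# Time available per frame, in milliseconds.
frame_budget_ms = 1000 / FRAME_RATE_HZ

# Uniform guessing over the seven mudra classes.
chance_overlap = 1 / 7

# KNN prediction times reported on the poster, in milliseconds.
knn_latency_ms = {"hands out of range": 6.9, "overlapping gestures": 3.9}

for task, latency in knn_latency_ms.items():
    share = latency / frame_budget_ms
    print(f"{task}: {latency} ms is {share:.0%} of one frame")
print(f"chance for 7 classes: {chance_overlap:.0%}")
```

Under these assumptions both KNN predictions finish in well under one frame period of about 16.7 ms.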