Introduction - Literature Review - Methodology - Results - Conclusion
Problem Statement and Goal
•Genetic algorithms (GAs) to create music
– With programmatic fitness, ineffective music
– With human input, fitness bottleneck
•Way to solve fitness bottleneck?
Relevance and Significance
•Creativity/collaboration for musical novices
•Potential solution for GA fitness bottlenecks
•Potential use for crowdsourcing
Research Question
Q: “When music that is created by a GA trained by a crowdsourced group is compared to music created by a GA trained by a small group, is the crowdsourced music more effective?”
A: Two instances of the same musical GA were run under those two training conditions, and composers and musical laypeople then reviewed the results. Song effectiveness was about the same overall.
Computer Music
•Composition, performance, analysis, sound processing, sound production
•Search problem with no optimal solution
•GA suitability
•First, with programmatic fitness only
•Next, with human evaluation as fitness
•Recurring bottleneck problem
Fitness Bottleneck and Workarounds
•GenJam – Biles (1994)
•Audioserve - Yee-King (2000)
•SBEAT3 - Unemi (2002)
•Constructive Adaptive User Interface (CAUI) - Legaspi et al. (2007)
•Gartland-Jones and Copley (2003)
•Unehara and Onisawa (2003)
•Composition, Feedback, and Evolution Framework – Fu et al. (2009)
identified the problem
attempted a solution
Crowdsourcing
•Outsourcing to collective online intelligence
•Pros
- around-the-clock
- inexpensive
- fast
- wisdom of crowd
•Marketplaces such as
• Cons
- untrustworthiness
- lack of skill
- ethics of outsourcing
Darwin Tunes
•Crowdsourced compositional GA – MacCallum and Leroi
•Evolectronica: Survival of the Funkiest
•641 generations of evolution
•Not mTurk, not a formalized study
Music Information Retrieval Evaluation eXchange (MIREX)
•Urbano, Morato, Marrero, & Martin (2010) used mTurk
•Crowdsourced ratings of music similarity
•expert-level results on 2,810 rankings for $70.25
GA choice - Melodycomposition
•Considered code from VARIATIONS, master’s thesis, Spieldose, and CAUI
•Melodycomposition – Craane on code.google.com
•Uses Java Genetic Algorithms Package (JGAP)
•Modifications:
– 2 melodies (SA)
– Additional fitness
– Interaction with mTurk
– Removal of GUI
– Database persistence
– # generations (11 & 200)
[F#:7:QUARTER][A#:4:QUARTER][F#:6:EIGTH]
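The bracketed string above looks like a serialized melody. As a rough illustration of how such a string could be read, here is a minimal Python sketch. It assumes each token has the form `[pitch:number:duration]`; treating the middle field as an octave is a guess, not something the slides state. `Note` and `parse_melody` are hypothetical names, not part of melodycomposition or JGAP, and duration labels are kept exactly as written.

```python
import re
from typing import List, NamedTuple


class Note(NamedTuple):
    pitch: str     # pitch name, e.g. "F#"
    octave: int    # assumed meaning of the middle field
    duration: str  # duration label, kept verbatim from the string


# One token: [PITCH:NUMBER:DURATION]
TOKEN = re.compile(r"\[([A-G][#b]?):(\d+):([A-Z_]+)\]")


def parse_melody(text: str) -> List[Note]:
    """Split a bracketed melody string into Note tuples."""
    return [Note(p, int(o), d) for p, o, d in TOKEN.findall(text)]


melody = parse_melody("[F#:7:QUARTER][A#:4:QUARTER][F#:6:EIGTH]")
for note in melody:
    print(note)
```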
Genre and Programmatic Fitness
•Chorale-like genre
– Instrumental
– 2-part (soprano/bass)
•List of fitness guidelines in addition to human ratings
– After Large Skip
– Consecutive Skips
– Global Pitch Distribution
– Interval
– Parallel Motion
– Proportion Notes/Rests
– Range
– Repeating Notes
– Scale
– Strong Beats
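One way to picture how rule-based guidelines and human ratings might combine into a single fitness value is a weighted sum. The Python sketch below is illustrative only: the two rule functions are simplified stand-ins loosely named after the "Range" and "Repeating Notes" guidelines, and the melody representation (MIDI pitch numbers), the thresholds, and the weights are all assumptions, not the formulas used in the study.

```python
from typing import Callable, Dict, List

Melody = List[int]  # MIDI pitch numbers; an assumed representation


def range_score(melody: Melody, max_span: int = 12) -> float:
    """1.0 while the melody spans at most max_span semitones, lower beyond that."""
    span = max(melody) - min(melody)
    return 1.0 if span <= max_span else max_span / span


def repeating_notes_score(melody: Melody) -> float:
    """Fraction of adjacent note pairs that are not immediate repeats."""
    pairs = list(zip(melody, melody[1:]))
    if not pairs:
        return 1.0
    return sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical rule table and weights (assumed, not taken from the study).
RULES: Dict[str, Callable[[Melody], float]] = {
    "range": range_score,
    "repeating_notes": repeating_notes_score,
}
WEIGHTS: Dict[str, float] = {"range": 1.0, "repeating_notes": 1.0}


def fitness(melody: Melody, human_rating: float = 0.0, human_weight: float = 1.0) -> float:
    """Weighted sum of rule scores plus a weighted human rating."""
    rule_total = sum(WEIGHTS[name] * rule(melody) for name, rule in RULES.items())
    return rule_total + human_weight * human_rating


print(fitness([60, 62, 64, 65], human_rating=0.5))
```

Changing the entries in `WEIGHTS` is the kind of adjustment a rule-weight tuning pass would make.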
Prototype and Task Setup
•Modification of melodycomposition
•Interaction with mTurk Java API
•Webpage for participants, with php and JavaScript to appear on mTurk
•MySQL database and Ubuntu server
•IRB approval from Nova
•IRB approval from ETSU
Generate songs
Post mTurk HITs
Send results to GA
Calculate fitness
Selection and mutation
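The five workflow steps above can be sketched as one loop. The Python below is a minimal sketch under stated assumptions, not the study's Java/JGAP implementation: a plain callback (`rate_song`) stands in for posting mTurk HITs and collecting results, the rating is used directly as fitness, and selection simply keeps the top half. The song representation, pitch range, mutation rate, and all function names are hypothetical.

```python
import random
from typing import Callable, List

Song = List[int]  # MIDI pitch numbers; an assumed representation


def generate_songs(rng: random.Random, count: int, length: int = 8) -> List[Song]:
    """Step 1: generate random songs."""
    return [[rng.randint(55, 79) for _ in range(length)] for _ in range(count)]


def mutate(rng: random.Random, song: Song, rate: float = 0.1) -> Song:
    """Replace each pitch with a random one with probability `rate`."""
    return [rng.randint(55, 79) if rng.random() < rate else p for p in song]


def evolve(rate_song: Callable[[Song], float], generations: int,
           pop_size: int = 12, seed: int = 0) -> List[Song]:
    rng = random.Random(seed)
    population = generate_songs(rng, pop_size)
    for _ in range(generations):
        # Steps 2-3: the callback stands in for posting HITs and returning results.
        ratings = [rate_song(song) for song in population]
        # Step 4: fitness is the rating itself in this sketch.
        ranked = sorted(zip(ratings, population), key=lambda pair: pair[0], reverse=True)
        # Step 5: keep the top half, refill with mutated copies of survivors.
        survivors = [song for _, song in ranked[: pop_size // 2]]
        children = [mutate(rng, rng.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return population


def stub_rating(song: Song) -> float:
    """Stand-in for human listeners: prefers small melodic steps."""
    return -float(sum(abs(b - a) for a, b in zip(song, song[1:])))


final_population = evolve(stub_rating, generations=11)
print(max(stub_rating(song) for song in final_population))
```

Because survivors are carried over unchanged, the best rating in this sketch never drops from one generation to the next.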
Training GAs
                 Control   Test
Generations      11        200
Participants     11        154*
Listening Tasks  275       5,000
Songs            825       15,000
Recruitment
Consent
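The training figures above are internally consistent, which a short calculation makes visible. The Python sketch below only divides the numbers given in the table; the resulting per-generation and per-task values are derived here, not stated on the slide, and the dictionary keys are illustrative names.

```python
# Figures copied from the training table.
training = {
    "control": {"generations": 11, "listening_tasks": 275, "songs": 825},
    "test": {"generations": 200, "listening_tasks": 5000, "songs": 15000},
}


def per_generation(row: dict) -> dict:
    """Derive tasks per generation and songs per task from the totals."""
    return {
        "tasks_per_generation": row["listening_tasks"] / row["generations"],
        "songs_per_task": row["songs"] / row["listening_tasks"],
    }


summary = {name: per_generation(row) for name, row in training.items()}
for name, values in summary.items():
    print(name, values)
```

Both conditions work out to 25 listening tasks per generation with 3 songs per task, so the two runs differ in generation count and participant pool rather than in per-generation workload.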
Evaluation by Reviewers and Composers
              Reviewers                                Composers
Participants  8                                        8
Songs         10                                       10
Recruitment
Consent
Instructions
Ratings       Like? Artistically Effective? Similar?   Interesting? Creative? Artistically Effective? Chorale-like?
Questions     What emotions? What was memorable?       What was memorable? What were shortcomings?
Music
•Small control group songs: 1 2 3 4 5
•Large test group songs: 6 7 8 9 10
Reviewers said:
curiosity, suspense, dissonance, ballet, storytelling, syncopation, mystery, anxiety, awkward rhythms, and too much distance between the bass and soprano
Reviewers said:
darkness, lack of flow, mystery, curiosity, happiness, ballads, major 3rds, and the need for tempo variance
Difference between Reviewers’/Composers’ Test minus Control Effectiveness
t test     N    Mean   St. Dev.  Min     Q1      Median  Q3     Max
Combined   16   .015   6.19      -8.00   -3.75   -0.67   0.75   19.00
Reviewers  8    .013   3.77      -8.00   -4.00   -2.50   1.75   19.00
Composers  8    .017   8.24      -4.67   -1.17   -0.33   0.00   8.67
Combined Reviewer Ratings of All Music
Combined Composer Ratings of All Music
Reviewers’/Composers’ Artistic Effectiveness Ratings
Paired t test  N    Mean    St.Dev.  SE Mean
Reviewers      10   35.50   4.65     1.47
Composers      10   26.60   4.67     1.48
Difference     10   8.90    4.65     1.47
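For readers who want to see how the "Difference" row relates to a paired t statistic, the Python sketch below applies the standard formulas (standard error = s / sqrt(n), t = mean difference / standard error) to the tabled summary values. The slide does not report a t value or p value, so the statistic computed here is a derivation from the table, and `paired_t` is an illustrative helper name.

```python
import math
from typing import Tuple


def paired_t(mean_diff: float, sd_diff: float, n: int) -> Tuple[float, float]:
    """Return (standard error of the mean difference, t statistic)."""
    se = sd_diff / math.sqrt(n)
    return se, mean_diff / se


# "Difference" row of the table: N = 10, mean = 8.90, St.Dev. = 4.65
se, t = paired_t(mean_diff=8.90, sd_diff=4.65, n=10)
print(f"SE = {se:.2f}, t = {t:.2f}, df = {10 - 1}")
```

The computed standard error matches the tabled SE Mean of 1.47, and the resulting t is about 6.05 on 9 degrees of freedom.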
Implications
•Test music slightly better overall, but not statistically significant
•Null hypothesis not rejected
Recommendations
•Fine-tune rules in programmatic fitness function
•Change rule weights
•Avoid premature convergence (mutation rate?)
•Compare to 200 generations of programmatic fitness only
•Use TurKit
•Use preference judgments instead of best/middle/worst
•Use voting or limit HITs to one per worker