
Evolutionary Computation Applied to Melody Generation

Matt D. Johnson

December 5, 2003

Abstract

In recent years, the personal computer has become an integral component in the typesetting and management of various types of music. However, the computer is capable of serving as more than just a typesetting and data management tool. This paper explores the ability of a computer to generate and arrange four-part vocal harmony in the style of church hymnody. The research presented here involves the use of an evolutionary algorithm to generate a melody. The resulting melody is then arranged into four parts using a decision tree for assigning chords. The result is an application that produces unique and pleasing music suitably arranged for Soprano, Alto, Tenor, and Bass.

Contents

1 Introduction
2 Related Work
    2.1 Interactive Systems
    2.2 Autonomous Systems
    2.3 Rule Based Systems
3 Research Methodology
    3.1 Problem Size
    3.2 Problem Simplification
    3.3 Problem Representation
    3.4 Evolutionary Cycle
        3.4.1 Initialize the Population
        3.4.2 Terminating Condition
        3.4.3 Selection of Parents
        3.4.4 Reproduction
        3.4.5 Mutation
        3.4.6 Rhythm Correction
        3.4.7 Competition
    3.5 Fitness
    3.6 Results
4 Conclusion

Keywords

Evolutionary Computation, Evolutionary Algorithm, Artificial Intelligence, Music Generation, Melody Generation, Computer Generated Music, Genetic Algorithm, Fitness Bottleneck

1 Introduction

In recent years, personal computers have become tools used to store and typeset sheet music. Research is currently underway that may lead to computer applications capable of generating and arranging music as well as a human being can. Section 2 of this document presents a brief overview of a few such research projects.

Section 3 of this paper presents an evolutionary algorithm which generates a melody in the traditional style of church hymnody. The resulting melody is in the soprano range. Alto, Tenor, and Bass parts are generated to go along with the melody using CAVM, a tool that automatically adds Alto, Tenor, and Bass parts to an existing melody [4].


2 Related Work

Evolutionary programming is a powerful tool which has been used by a number of researchers in the field of computer generated music. The greatest challenge facing researchers in this community appears to be the design of the fitness function for their evolutionary systems. The authors of [8] categorize computer generated music research according to the fitness function used in each method. A number of those categories are used here.

2.1 Interactive Systems

The fitness function in an interactive system is a human being. Every generation created by the evolutionary program must be painstakingly evaluated by hand. This creates a “fitness bottleneck” [1]. However, it practically guarantees the patient user a computer generated melody that is pleasing to that individual.

A system called “Variations” is presented by Bruce L. Jacob [3]. Jacob chose to conduct his experiments at the level of phrases and motives instead of notes; typically, computer generated music is implemented at the note level. This system uses three modules, namely the ear, composer, and arranger. Each of these modules either uses a genetic algorithm, or was developed with a genetic algorithm.

Composing with “Variations” requires a human operator to define a number of motives which will be used as the basis for the musical composition [3]. Phrases are developed by the composer module, which performs recombination and variation on the original motives. The composer module refers to the ear module to determine whether a given phrase is acceptable. Once a number of accepted phrases are created, the arranger module will put the phrases together and wait for feedback from a human evaluator. The arranger module will continue to work with the human evaluator until the program terminates.

2.2 Autonomous Systems

In a typical EA, the fitness function is constant, while the population evolves over time to become more fit. Autonomous systems are different in that both the population and the fitness function evolve [8].

One of the most interesting pieces of literature uncovered in this research is a paper entitled “Frankensteinian Methods for Evolutionary Music Composition” [2]. In this paper, Gregory begins by presenting an extensive overview of a number of different music composition projects. Throughout the overview, references to Frankenstein are used to illustrate various points. The paper climaxes in section four when the author presents his evolutionary ideas for generating music.

In the “Frankensteinian” approach, both the individual and the environment coevolve. According to Gregory, this relationship is similar to the relationship between Frankenstein and his monster. Frankenstein and his monster each contributed to the other's environment, so they evolved together, each in response to the other.

Gregory presents two types of individuals in section 4.2 of his paper, “Coevolving hopeful singers and music critics”. The female individual represents the evolving environment and chooses the males, which represent the singers. The female maintains a note transition table. This table indicates what type of transitions she expects and with what frequency. The table is initialized with note transitions collected from simple folk-tune melodies. Over time, the table can change in response to what the female observes in the male singers. This creates the changing environment. Males in the system start out with randomly generated melodies and evolve based on the environment.

2.3 Rule Based Systems

The rule based system uses a fitness function which encodes a set of rules. The rules must be built into the system based on the author's musical knowledge [8].

George Papadopoulos and Geraint Wiggins [7] present a genetic algorithm for generating jazz melodies based on an input chord progression. Their algorithm is made distinctive by the following characteristics:

1. The algorithmic fitness function described in [7] calculates the weighted sum of a number of distinct characteristics of the chromosome. This approach avoids the “fitness bottleneck” described by John A. Biles [1].

2. Problem specific genetic operators allow this system to converge to a high fitness relatively quickly.

3. The representation of the melody is based on the scale degree of a note, as opposed to the traditional binary encoding. This allows for greater readability and more problem specific operators.

The paper concludes by saying that the resulting system frequently generates “interesting” patterns, and also enumerates some extensions which could lead to more human-like jazz melodies.

A genetic algorithm for harmonising chorale melodies is presented in “Evolutionary Methods for Musical Composition” [9]. Note representation is based on standard western music syntax. Information such as the key signature and time signature is stored. For every note, pitch is expressed in terms of scale degree and its duration is an integer; another integer is used to indicate the octave the note occurs in. The absolute pitch of the note is not stored.

The genetic algorithm presented in [9] makes use of several domain specific operators. One such operator is named “Splice” and is a traditional one point crossover. A unique operator in this implementation is the “PhraseEnd” operator. The “PhraseEnd” operator mutates the end of a phrase such that it ends with a chord in root position.

Two types of fitness functions are used in this genetic algorithm. The first fitness function evaluates individual voices; it tends to favor movement in a consonant direction and penalizes large jumps in the voice. The second fitness function considers the relationship between voices, and tends to avoid certain types of parallel motion and crossed voices [9].

The authors of [9] note in their review of this genetic algorithm that the results are decent, but certainly not optimal. The domain knowledge encoded in the algorithm allowed them to achieve these results rather quickly, within 300 generations. They end this section of their paper by suggesting that a conventional rule based system working in conjunction with one or more genetic algorithms would be a better approach to harmonisation.


Figure 1: Voice Ranges. (Bass, Tenor, Alto, and Soprano ranges shown across the Great (-1), Small (0), One-Line (1), and Two-Line (2) octaves.)

3 Research Methodology

3.1 Problem Size

The decision to use an evolutionary algorithm to generate a melody is driven by one main factor: complexity. Consider the following. An average soprano can sing notes in the range from D1 to G2, or 18 different pitches. (See Figure 1 for an illustration of voice ranges.) There are 8 note durations typically found in church hymnody: sixteenth, eighth, quarter, half, whole, dotted eighth, dotted quarter, dotted half. The number of notes found in a typical hymn can range from roughly 20 to 60 (40 on average). Given this information, the number of potential melodies can be calculated.

Pitches * Durations = 18 * 8 = 144 = # possible notes
(# possible notes)^(melody length) = 144^40 = 2.16 * 10^86 melodies

The large search space makes an EA well suited to tackling this problem.
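The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function names `melody_count` and `sci` are hypothetical and not part of the original implementation.

```python
def melody_count(pitches: int, durations: int, length: int) -> int:
    """Number of distinct melodies: (pitches * durations) ** length."""
    return (pitches * durations) ** length


def sci(n: int) -> str:
    """Format a large positive integer in scientific notation (two decimals)."""
    exponent = len(str(n)) - 1          # number of digits minus one
    mantissa = n / 10 ** exponent       # exact big-int division, rounded to float
    return f"{mantissa:.2f} * 10^{exponent}"


# 18 pitches, 8 durations, and an average melody length of 40 notes
full = melody_count(pitches=18, durations=8, length=40)
print(sci(full))  # 2.16 * 10^86
```

Python integers have arbitrary precision, so the count is computed exactly before it is rounded for display.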

3.2 Problem Simplification

Since the number of possible melodies is so large, reduction of the search space will make the problem more manageable. This is done quite handily by acting on two observations. First, most melodies stay within the key of the musical piece. Second, the range of most melodies does not exceed one octave. By constraining the melody to notes within one key (F) and one octave, the number of pitches drops from 18 to 8. For completeness, a rest is included as a pitch, making the number of pitches 9. This changes our initial calculation to the following:

Pitches * Durations = 9 * 8 = 72 = # possible notes
(# possible notes)^(melody length) = 72^40 = 1.96 * 10^74 melodies


Note Duration     Integer Used
whole note        0
dotted half       1
half              2
dotted quarter    3
quarter           4
dotted eighth     5
eighth            6
sixteenth         7

Table 1: Note Duration Mapping

Scale Degree    Note name in Key of F    Integer Used
REST            -                        0
ONE             F                        1
TWO             G                        2
THREE           A                        3
FOUR            B flat                   4
FIVE            C                        5
SIX             D                        6
SEVEN           E                        7
EIGHT           F                        8

Table 2: Note Degree Mapping

This is still a daunting number of melodies, but it is significantly smaller than the first calculation. Additionally, the restrictions placed on the melody will automatically produce a more pleasing sound, because notes outside the key will not occur.

3.3 Problem Representation

The note is the building block of music. Therefore, the cornerstone of the representation is a Note structure. The structure consists of a scale degree and a duration. The duration of a note indicates how long the note will sound. This value is represented as an enumerated integer type. Table 1 illustrates the mapping of a note duration to the underlying integer used in the implementation. The scale degree of a note indicates its pitch within a given key. For simplicity, every melody generated by this algorithm is in the key of F. Table 2 illustrates the mapping between the scale degree, the letter name of the note in the key of F, and the underlying integer used in the implementation. A complete melody consists of a vector of notes.
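As a rough illustration of this representation, the sketch below models a note in Python. The integer values follow Tables 1 and 2; the names `Duration`, `DEGREE_NAMES`, and `Note`, and the use of Python rather than the original implementation language, are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Duration(IntEnum):
    """Note durations with the integer codes of Table 1."""
    WHOLE = 0
    DOTTED_HALF = 1
    HALF = 2
    DOTTED_QUARTER = 3
    QUARTER = 4
    DOTTED_EIGHTH = 5
    EIGHTH = 6
    SIXTEENTH = 7


# Index = scale degree integer of Table 2; 0 is a rest, 1 and 8 are both F.
DEGREE_NAMES = ["REST", "F", "G", "A", "B flat", "C", "D", "E", "F"]


@dataclass
class Note:
    degree: int          # 0 (rest) through 8
    duration: Duration   # enumerated duration

    def name(self) -> str:
        """Letter name of the note in the key of F."""
        return DEGREE_NAMES[self.degree]


# A melody is simply an ordered list of notes.
melody: List[Note] = [
    Note(1, Duration.QUARTER),
    Note(3, Duration.HALF),
    Note(0, Duration.QUARTER),
]
print([n.name() for n in melody])  # ['F', 'A', 'REST']
```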

A class called Individual is responsible for storing the melody. In addition to the melody, an Individual also contains the following functions: Initialize, Crossover, GetFitness, ForceBeats, ChangeOneNoteDegree, ChangeOneNoteLength. These functions are used throughout the evolutionary process, and will be explained in later sections.

The controlling class is named Population. The Population class directs the evolutionary process and stores all the individuals in an AVL tree [5] ordered by the fitness of each Individual.

3.4 Evolutionary Cycle

The evolutionary cycle used is as follows:

Initialize the Population;
while (the terminating condition has not been reached) {
    Select two parents;
    Reproduction;
    Mutate the children;
    Correct the rhythm of the children;
    Competition;
}

Each of these steps will now be explained in detail.

3.4.1 Initialize the Population

The size of the population is encoded in the Population class and is currently set at fifty individuals. For every member of the population, Population will instantiate an Individual and call the Individual::Initialize function. The Individual::Initialize function will decide the length of the melody (from 20 to 60 notes) and then generate that number of notes; each note has a randomly generated scale degree and duration. The Individual::ForceBeats function (see section 3.4.6 for details) is called after Individual::Initialize.
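The initialization step might be sketched as follows. Notes are represented here as plain (scale degree, duration) tuples using the integers of Tables 1 and 2; the function names are hypothetical, the uniform random choices are an assumption, and the ForceBeats rhythm correction is deliberately left out of this sketch.

```python
import random

POPULATION_SIZE = 50            # population size stated in the text
MIN_NOTES, MAX_NOTES = 20, 60   # melody length range stated in the text


def random_melody(rng: random.Random) -> list:
    """Build one melody of (scale_degree, duration) tuples.

    Degrees run 0..8 (Table 2) and durations 0..7 (Table 1);
    drawing them uniformly is an assumption of this sketch.
    """
    length = rng.randint(MIN_NOTES, MAX_NOTES)  # inclusive on both ends
    return [(rng.randint(0, 8), rng.randint(0, 7)) for _ in range(length)]


def init_population(rng: random.Random) -> list:
    """Create the initial population of random melodies."""
    return [random_melody(rng) for _ in range(POPULATION_SIZE)]


rng = random.Random(2003)       # fixed seed so the run is repeatable
population = init_population(rng)
print(len(population), len(population[0]))
```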

3.4.2 Terminating Condition

A population will evolve until 100,000 generations have been reached, or until the best individual in the population has a fitness of at least 30. See section 3.5 for a complete description of the fitness function.

3.4.3 Selection of Parents

Since the individuals are stored in an AVL tree based on their fitness, implementation of rank based selection is straightforward. The tree is traversed starting with the most fit individual, proceeding towards the least fit individual. At any point along the traversal, the current individual has a twenty percent chance of being selected. Traversal will continue through the tree, giving every individual along the way a twenty percent chance of selection when it is visited, until one is finally selected. If the traversal fails to select an individual, the most fit individual in the tree will be used. Once an individual is selected, the traversal starts over and a second parent is selected using the same criteria. It is possible that the same individual will be selected both times.
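A minimal sketch of this selection rule is shown below. It assumes the individuals are available as a Python list already sorted from most fit to least fit (standing in for the in-order traversal of the AVL tree); the name `select_parent` and its parameters are hypothetical.

```python
import random


def select_parent(ranked, rng, chance=0.20):
    """Rank based selection over individuals sorted most fit first.

    Each visited individual is chosen with probability `chance`.
    If the walk reaches the end without choosing anyone, the most
    fit individual is used, as described in the text.
    """
    for individual in ranked:
        if rng.random() < chance:
            return individual
    return ranked[0]


rng = random.Random(42)
ranked = ["best", "second", "third", "worst"]   # placeholder individuals
parents = (select_parent(ranked, rng), select_parent(ranked, rng))
print(parents)
```

Under this rule the individual at rank k (counting from zero) is chosen with probability 0.2 * 0.8^k, so selection pressure falls off geometrically with rank. With fifty individuals the fallback case has probability 0.8^50, which is very small.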

3.4.4 Reproduction

Reproduction is essentially crossover between the two children, who at this point are just copies of their parents. One child will call its Individual::Crossover function. The Crossover function takes as an argument another Individual, which is the second child. To facilitate the discussion of reproduction, the following terminology will be used:

this melody: The melody contained in the Individual whose Crossover function is currently executing.

in melody: The melody contained in the Individual that was passed into the Crossover function.

temp melody: The temporary melody that was created inside the Crossover function.

The Crossover function will randomly select a scale degree and use it as a crossover point. The length of temp melody is set to the length of the shorter of the other two melodies. The melody temp melody is created by copying notes from this melody until the crossover point is hit. Then, in melody will be scanned until the crossover point is found. Starting with this crossover point in in melody, notes will be copied from in melody to temp melody until another crossover point is encountered. The algorithm will at this point switch back to this melody for another chunk of notes. Thus, temp melody is created by adding sets of notes from the other two melodies until the melody is full. (Refer to Figure 2 for an example.) At the end of Individual::Crossover, this melody is reassigned to be the same as temp melody.
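The description above leaves several details open, so the following Python sketch is only one possible reading. It assumes that each parent keeps its own read position, that the note at which a switch is triggered is not copied from the parent being left, that copying in the other parent resumes at its next note with the crossover degree, and that copying simply continues in the current parent when the other parent has no such note left. Notes are (scale degree, duration) tuples, and the names `find_degree` and `crossover` are hypothetical.

```python
def find_degree(melody, degree, start):
    """Index of the first note at or after `start` with the given degree."""
    for i in range(start, len(melody)):
        if melody[i][0] == degree:
            return i
    return None


def crossover(this_melody, in_melody, cross_degree):
    """Build temp melody by alternating chunks of the two parents."""
    target = min(len(this_melody), len(in_melody))  # length of shorter parent
    sources = (this_melody, in_melody)
    cursor = [0, 0]          # read position in each parent
    cur = 0                  # start by copying from this_melody
    chunk_start = False      # True right after switching parents
    temp = []
    while len(temp) < target:
        src = sources[cur]
        if cursor[cur] >= len(src):
            break            # assumption: stop if the active parent runs out
        note = src[cursor[cur]]
        if note[0] == cross_degree and not chunk_start:
            other = 1 - cur
            nxt = find_degree(sources[other], cross_degree, cursor[other])
            if nxt is not None:
                cursor[other] = nxt      # resume at the other parent's crossover note
                cur = other
                chunk_start = True
                continue
        temp.append(note)
        cursor[cur] += 1
        chunk_start = False
    return temp


this_melody = [(1, 4), (2, 4), (3, 4), (4, 4), (3, 4), (5, 4)]
in_melody = [(5, 2), (3, 2), (6, 2), (3, 2), (7, 2), (8, 2)]
child = crossover(this_melody, in_melody, cross_degree=3)
print(child)  # [(1, 4), (2, 4), (3, 2), (6, 2), (3, 4), (4, 4)]
```

Because every switch happens at a note of the same scale degree in both parents, the pitch line of the child stays continuous across the splice; only the durations change at the switch.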

3.4.5 Mutation

The only two mutation operators are ChangeOneNoteLength and ChangeOneNoteDegree, which are used by both children. ChangeOneNoteLength randomly selects a note in the melody. Then, it either decreases or increases the integer which represents the note duration. Table 1 shows the note duration to integer mapping. ChangeOneNoteDegree operates in exactly the same manner, except it modifies the scale degree instead of the note duration. Table 2 shows the scale degree to integer mapping.
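The two operators could look roughly like the sketch below. The text does not say how far a value moves or what happens at the ends of the tables, so a step of exactly one and clamping to the valid range are both assumptions here, as are the Python function names.

```python
import random


def _change_one(melody, field, low, high, rng):
    """Nudge one field of one randomly chosen note up or down by one.

    `field` is 0 for scale degree and 1 for duration. The result is
    clamped to [low, high]; this boundary rule is an assumption.
    """
    index = rng.randrange(len(melody))
    note = list(melody[index])
    step = rng.choice((-1, 1))
    note[field] = min(high, max(low, note[field] + step))
    melody[index] = tuple(note)


def change_one_note_length(melody, rng):
    _change_one(melody, field=1, low=0, high=7, rng=rng)   # Table 1 range


def change_one_note_degree(melody, rng):
    _change_one(melody, field=0, low=0, high=8, rng=rng)   # Table 2 range


rng = random.Random(7)
melody = [(1, 4), (3, 4), (5, 2), (1, 0)]
change_one_note_length(melody, rng)
change_one_note_degree(melody, rng)
print(melody)
```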

3.4.6 Rhythm Correction

During the development stages of this project, the observation was made that the rhythm patterns in the melodies were exceptionally difficult and unusual. To correct this problem, a deterministic function called ForceBeats was introduced. ForceBeats works as follows:

Loop through the whole melody
    Select the next note
    If the note duration is equal to one beat
        Go on to the next note.
    If the note duration is more than a beat
        Ensure that the following note or notes do not
        extend beyond the end of the current count.
    If the note duration is less than one beat
        Ensure that the following note or notes plus the
        current one have a total duration of one count.
End loop

Figure 2: Reproduction Crossover.

3.4.7 Competition

Every generation, two individuals are born, and two individuals die. The two individuals created are stored in the tree. Then, the two least fit individuals in the tree are terminated.
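The competition step can be sketched with a sorted list in place of the AVL tree. This substitution, the tuple layout, and the names `add` and `compete` are assumptions made for illustration only.

```python
import bisect
from itertools import count

_tiebreak = count()   # unique counter so melodies themselves are never compared


def add(population, fitness, melody):
    """Insert an individual, keeping the list sorted by ascending fitness."""
    bisect.insort(population, (fitness, next(_tiebreak), melody))


def compete(population, children):
    """Add the children, then remove as many of the least fit individuals."""
    for fitness, melody in children:
        add(population, fitness, melody)
    del population[:len(children)]   # least fit individuals sit at the front


population = []
for fitness in (1.0, 5.0, 9.0):
    add(population, fitness, melody=[])
compete(population, [(7.0, []), (0.5, [])])
print([entry[0] for entry in population])  # [5.0, 7.0, 9.0]
```

The population size stays constant, and a child that is less fit than every existing individual is removed in the same generation it was born.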

3.5 Fitness

Without a doubt, the fitness function was the most challenging aspect of this project. The fitness function is a member of the Individual class. The fitness function uses a “Fitness Loop”, which cycles through every note in the melody, checking the relationship of the current note with the note which follows it. As the melody is evaluated, the fitness function keeps a running total of “fitness points”.

The following list provides a name for a particular characteristic within the melody, the fitness points awarded for that characteristic, and a brief description. The phrase “next note” is used below to indicate the note which follows the current note in the “Fitness Loop”.

1. SAME NOTE: Fitness Points: 17. The scale degree of the next note has not changed.

2. ONE STEP: Fitness Points: 17. The scale degree of the next note has gone up or down one step.

3. ONE THIRD: Fitness Points: 15. The scale degree of the next note has gone up or down two steps.

4. ONE FOURTH: Fitness Points: 12. The scale degree of the next note has gone up or down three steps.

5. ONE FIFTH: Fitness Points: 10. The scale degree of the next note has gone up or down four steps.

6. OVER FIFTH: Fitness Points: -25. The scale degree of the next note is greater than four steps away.

7. FOUR SEVEN: Fitness Points: -25. The current note is scale degree four and the next note is scale degree seven.

8. SIXTEENTH NOTE: Fitness Points: -10. The current note is a sixteenth note.

9. DRASTIC DURATION CHANGE: Fitness Points: -20. The duration change between the current note and the next note is more than four steps in Table 1.

10. BEGIN TONIC: Fitness Points: 50. The melody begins with the tonic note (scale degree 1).

11. END TONIC: Fitness Points: 50. The melody ends in the tonic note (scale degree 1).

“Fitness points” are awarded and stored in a local integer variable as the “Fitness Loop” executes. The function returns the value of the fitness points divided by the number of notes. In the event that the number of fitness points happens to be negative, the function will return -100.
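The scoring rules listed above can be sketched in Python as follows. Several points are assumptions of this sketch rather than documented behavior: the rules are treated as additive (so a four-to-seven move earns the ONE FOURTH points and the FOUR SEVEN penalty), rests are compared using their integer value 0, the per-note rules are applied only to notes that have a following note, and the final division is a floating point division. Notes are (scale degree, duration) tuples and the melody is assumed to be non-empty.

```python
SIXTEENTH = 7                                   # duration code from Table 1
INTERVAL_POINTS = {0: 17, 1: 17, 2: 15, 3: 12, 4: 10}
OVER_FIFTH = -25


def get_fitness(melody):
    """Score a melody of (scale_degree, duration) tuples."""
    points = 0
    # "Fitness Loop": compare each note with the note that follows it.
    for (degree, duration), (next_degree, next_duration) in zip(melody, melody[1:]):
        points += INTERVAL_POINTS.get(abs(next_degree - degree), OVER_FIFTH)
        if degree == 4 and next_degree == 7:    # FOUR SEVEN
            points -= 25
        if duration == SIXTEENTH:               # SIXTEENTH NOTE
            points -= 10
        if abs(next_duration - duration) > 4:   # DRASTIC DURATION CHANGE
            points -= 20
    if melody[0][0] == 1:                       # BEGIN TONIC
        points += 50
    if melody[-1][0] == 1:                      # END TONIC
        points += 50
    if points < 0:
        return -100
    return points / len(melody)


print(get_fitness([(1, 4), (2, 4), (1, 4)]))    # (17 + 17 + 50 + 50) / 3
print(get_fitness([(4, 7), (7, 0)]))            # negative total, so -100
```

Under this reading, a melody of n notes can score at most (17(n - 1) + 100) / n = 17 + 83/n, which is the case where every interval earns 17 points and both tonic bonuses apply.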

When the population is first initialized, the best individual's fitness is typically around zero. In 1000 generations, the fitness of the best individual will usually reach at least 15. Sometimes, a fitness of 18 or better can be achieved in that time frame. In other situations, a fitness of 18 is never achieved. Figure 3 shows the fitness of the best individual in the tree every 1000 generations for a particular run. Figure 4 illustrates the average fitness of the population every 1000 generations.

Figure 3: Best Fitness.

Figure 4: Average Fitness.

Figure 5: A Melody with a fitness of 14.




Figure 8: A Melody with a fitness of 17 arranged into four parts.

3.6 Results

In the early stages of development, the melodies generated were quite disappointing. However, after fine tuning the fitness function, juggling parameters, and deterministically correcting rhythm patterns, the resulting melodies are quite nice. The best melodies seem to be in the fitness range of 14 to 20, depending on how much excitement is desired in the melody.

Melodies on the low end of this range are quite unique and interesting. These melodies are also more difficult to sing, and may not sound as nice. Figure 5 is a good example of this type of melody. (All music shown in this document was typeset by Lilypond [6].)

Melodies with a fitness greater than 19 start to resemble one another. The algorithm terminates before the one “perfect” individual is found, but it does appear that, given infinite time, that “perfect” individual would be a rather boring melody. Figure 9 is actually one of the more interesting super high fitness melodies. Other individuals with a fitness of 20 have been less interesting.



The best individuals are in the fitness range of 16 to 18. They exhibit uniqueness, are pleasant to listen to, and tend to be easy to sing. Figure 7 is an excellent representative of the top notch individuals produced by the EA. Notice how the part moves around and changes frequently, but has few erratic jumps. Figure 8 uses the same melody and presents alto, tenor, and bass to accompany the melody. The alto, tenor, and bass lines are arranged by CAVM [4].

4 Conclusion

Without a doubt, a simple evolutionary algorithm is capable of generating very nice melodies. Further development including musical modeling would lead to even better melodies. More sophisticated and problem specific genetic operators would also likely improve the results.

Computers are inherently good at doing any type of work which requires crunching numbers or doing logic. Their biggest weakness lies in areas which involve feelings and emotions, such as art and music. The research presented here, as well as ongoing research in the computer generated music community, leads this author to conclude that there may come a day in the near future in which computers can do more than just crunch numbers.

Over time, Computer Science methodologies will continue to develop. Eventually, these methods will converge to mimic the creative nature of the human brain. Imagine a computer that can compose with the anger of Wagner, do so in a moment, and make no typos in the process. Artificial Intelligence will grow until it encapsulates the nature and production of human feelings into ones and zeros. At that point, we will have computers that can not only crunch numbers, but can also express emotions.

References

[1] BILES, J. Genjam: A genetic algorithm for generating jazz solos, 1994.

[2] GREGORY, P. T. Frankensteinian methods for evolutionary music composition.

[3] JACOB, B. Composing with genetic algorithms, 1995.

[4] JOHNSON, M. D., AND WILKERSON, R. W. Computerized arrangement of vocal music. In Intelligent Engineering Systems Through Artificial Neural Networks Volume II (2001).

[5] KARAS, W. Code: Abstract avl tree template - available in the public domain.

[6] LILYPOND. http://lilypond.org/web.

[7] PAPADOPOULOS, G., AND WIGGINS, G. A genetic algorithm for the generation of jazz melodies.

[8] SANTOS, A., ARCAY, B., DORADO, J., ROMERO, J., AND RODRIGUEZ, J. Evolutionary computation systems for musical composition. In Proceedings of Acoustics and Music: Theory and Applications (AMTA 2000). vol 1. pp 97-102. ISBN:960-8052-23-8. (2000).

[9] WIGGINS, G., PAPADOPOULOS, G., PHON-AMNUAISUK, S., AND TUSON, A. Evolutionary methods for musical composition, 1998.
