
Page 1: Evolving Quick Learners: Novel Initialization Strategies for Markov Brains

Douglas Kirkpatrick and Tyler Derr
Michigan State University

Page 2: Overview

- Background
- Markov Brains
- Pre-Evolution & Multi-Task Evolution
- Problem
- Project Description
- Pre-Evolution
- Evolving your Ancestors
- Results

Page 3: Background

Page 4: Markov Brains

Example brain diagram: 9 input nodes, 3 output nodes, 4 hidden nodes, 3 gates.

Page 5: Markov Brains

Idea: We are evolving the logic gates that wire up a set of nodes, thus forming a network.

- Genotype:
  - Variable length, 2k to 20k loci
  - Each locus can be an integer value from 0 to 255
  - Start codons along the genome mark where gates should be read from (similar to genes in DNA)
- Phenotype:
  - A set of N nodes (input, hidden, and output)
  - A set of logic gates connecting the nodes (in our case, deterministic gates)
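
To make the genotype-to-phenotype mapping concrete, here is a minimal Python sketch of decoding gates from a byte genome. The two-byte start codon, the gate layout, and all constants are illustrative assumptions for this sketch, not the exact Markov Brain / MABE encoding.

import random

START_CODON = (42, 213)  # assumed two-byte start codon; illustrative only

def decode_gates(genome, n_nodes, max_io=4):
    """Scan the genome for start codons; after each one, read a simplified
    deterministic gate: input/output counts, wiring, and a logic table."""
    def read(k):                                   # treat the genome as circular
        return genome[k % len(genome)]

    gates = []
    for i in range(len(genome) - 1):
        if (genome[i], genome[i + 1]) != START_CODON:
            continue
        j = i + 2
        n_in = read(j) % max_io + 1                # 1..max_io input wires
        n_out = read(j + 1) % max_io + 1           # 1..max_io output wires
        ins = [read(j + 2 + k) % n_nodes for k in range(n_in)]
        outs = [read(j + 2 + n_in + k) % n_nodes for k in range(n_out)]
        t0 = j + 2 + n_in + n_out
        table = [read(t0 + k) % (2 ** n_out) for k in range(2 ** n_in)]
        gates.append({"inputs": ins, "outputs": outs, "table": table})
    return gates

genome = [random.randint(0, 255) for _ in range(20000)]   # loci are ints in 0..255
for pos in (100, 7000, 15000):        # seed a few start codons so the demo finds gates
    genome[pos:pos + 2] = START_CODON
print(len(decode_gates(genome, n_nodes=16)), "gates decoded")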

Page 6: Problem

Brains are not quick to evolve:

- 100s of millions of years to evolve in humans
- 100,000 to 1,000,000 or more generations to evolve in Markov Brains

Can we better initialize the Markov Brains so that they evolve faster?

Page 7: Pre-Evolution

If I'm good at hockey, one of my cousins is good at football, and a second cousin is good at basketball, it's likely that my ancestors had some athletic ability.

If there's another relative out there, it's likely they're good at rugby or soccer.

A similar idea can be applied to Markov Brains.

Page 8: Multi-Task Evolution

Multi-Task Learning: attempting to learn more than one task/problem at a time.

Ex. Learning to drive a car and a boat in the same day.

This can lead to a better and more generalized understanding. Why? Because it allows the learner to see the commonality (i.e. the shared underlying sub-tasks) and use it toward the main tasks.

Ex. Learning the general concept of steering left/right will greatly improve one's ability to drive both a car and a boat.

Page 9: Approach

Page 10: Approach

Two experiments:

1. Pre-Evolution
2. Multi-Task Evolution

Page 11: Pre-Evolution Experiment

Core world: Red/Blue Berry

Variants:

- Green (poison) berry
- Orange (superfruit) berry
- Random walls in the environment

Page 12: Berry World

Page 13: World Variants Used

5 Training Worlds:

1. Poison
2. Superfruit
3. Superfruit & Poison
4. Random Walls & Poison
5. Random Walls & Special Food

1 Testing World:

- Original Red/Blue Berry World (no enhancements)

Page 14: Experimental Structure

Flow diagram: EVOLVE training runs (1,000 generations, x100) -> Best -> Ancestor -> EVOLVE testing run (10,000 generations) -> Results!

Page 15: Setup

- Markov Brains w/ 24 nodes
- Training runs: 1,000 generations / 500 brains
  - 20 runs of each training world
- Testing runs: 10,000 generations / 100 brains
- Testing runs compared with equivalent randomly seeded runs
- 55 replicates
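
A schematic Python sketch of this pipeline under toy assumptions: "brains" are byte lists, "worlds" are hidden target patterns, and evolve() is a tiny GA standing in for a full MABE run. Population sizes, generation counts, and runs per world are shrunk from the 500 / 1,000 / 20-per-world setup above so the demo finishes quickly, and the rule for picking the single Best ancestor from the training runs is an assumption (the slides do not spell it out).

import random

GENOME_LEN = 64

def random_brain():
    return [random.randint(0, 255) for _ in range(GENOME_LEN)]

def mutate(brain, rate=0.02):
    return [random.randint(0, 255) if random.random() < rate else v for v in brain]

def make_world(seed):
    """Toy world: fitness = how many loci match a hidden target pattern."""
    target = random.Random(seed).choices(range(256), k=GENOME_LEN)
    return lambda brain: sum(b == t for b, t in zip(brain, target))

def evolve(fitness, population, generations):
    """Tiny truncation-selection GA; returns the final population and its best brain."""
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[:max(1, len(population) // 5)]
        population = [mutate(random.choice(parents)) for _ in population]
    return population, max(population, key=fitness)

# Training phase: several short runs in each training-world variant
training_worlds = [make_world(s) for s in range(5)]        # 5 variants, as on the slides
candidates = []
for world in training_worlds:
    for _ in range(5):                                     # slides use 20 runs per world
        _, best = evolve(world, [random_brain() for _ in range(50)], generations=10)
        candidates.append((world, best))

# Assumption: the Ancestor is the training brain with the highest training fitness
ancestor = max(candidates, key=lambda wb: wb[0](wb[1]))[1]

# Testing phase: one run seeded from the ancestor vs. an equivalent randomly seeded run
test_world = make_world(seed=99)                           # stands in for the plain Red/Blue world
_, seeded_best = evolve(test_world, [mutate(ancestor) for _ in range(50)], generations=20)
_, random_best = evolve(test_world, [random_brain() for _ in range(50)], generations=20)
print("seeded:", test_world(seeded_best), " random:", test_world(random_best))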

Page 16: Experiment: Evolve your ancestors (Real World Analogy ...kinda sorta ...no, not really)

P = {random set of people}
repeat
    For each p in P:
        - Let p have children and grandchildren, where they select mates based on how good they are at playing chess
        - chessBest = fitness of the best chess-playing descendant of p
        - Let p go back in time (have children...) but this time select mates based on how good they are at playing Go
        - goBest = fitness of the best Go-playing descendant of p
        - Fitness_p = avg(chessBest, goBest)
    P = selection & mutation(P)
until(stoppingCondition)   (i.e. when P is finally worthy to be your ancestors)
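
The same loop as a small runnable Python sketch. "People" are toy real-valued genomes and "chess" / "Go" are two illustrative target-matching tasks; every function and constant here is an assumption made so the analogy runs end to end.

import random

GENOME_LEN = 32

def random_person():
    return [random.random() for _ in range(GENOME_LEN)]

def mutate(ind, sigma=0.1):
    return [g + random.gauss(0, sigma) for g in ind]

def make_task(seed):
    """Toy task: reward closeness to a hidden target vector."""
    target = [random.Random(seed * 1000 + i).random() for i in range(GENOME_LEN)]
    return lambda ind: -sum((g - t) ** 2 for g, t in zip(ind, target))

chess, go = make_task(1), make_task(2)

def best_descendant_fitness(p, task, children=10, generations=5):
    """Let p have descendants who 'select mates' (here: hill-climb) on task;
    return the fitness of the best descendant seen."""
    pop = [mutate(p) for _ in range(children)]
    best = max(map(task, pop))
    for _ in range(generations):
        parent = max(pop, key=task)
        pop = [mutate(parent) for _ in range(children)]
        best = max(best, max(map(task, pop)))
    return best

def ancestor_fitness(p):
    return (best_descendant_fitness(p, chess) + best_descendant_fitness(p, go)) / 2

P = [random_person() for _ in range(20)]
for _ in range(10):                                  # stoppingCondition: a fixed budget
    elite = sorted(P, key=ancestor_fitness, reverse=True)[:5]
    P = [mutate(random.choice(elite)) for _ in P]    # selection & mutation
print("P evolved to be (analogy-grade) worthy ancestors")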

Page 17: Experiment: Evolve your ancestors (Real World Analogy; again, most likely not so real)

Question: If your goal in life is to be the best checkers player ever, would you rather come from

A (a random set of ancestors) || B (the evolved ancestors P)?

Remember: P was evolved to have descendants that are excellent at playing chess and Go.

If you picked A: It's possible, but good luck!

If you picked B: This could be you!

Image Source: https://www.floridamemory.com/items/show/134382

Page 18: Experiment: Evolve your ancestors

Flow diagram:

1. Initialize a random pop P0 = {p1, p2, ..., pn}.
2. Evolve P0, where Fitness = the training (descendant-based) fitness; set P*0 = Pfinal.
3. Initialize from P*0 and evolve, where Fitness = the target task; record the best fitness found.
4. Initialize a fresh random P0 and evolve, where Fitness = the target task; record the best fitness found.
5. Compare: ?? > = < ??

Page 19: Problems Used: Random Mapping World

Generate(numInputBits, numOutputBits): lists all possible input values and randomly selects a corresponding output for each.

Ex. Generate(3, 2):

000 | 00
001 | 00
010 | 11
011 | 01
100 | 11
101 | 10
110 | 00
111 | 10

Fitness = % of outputs correctly predicted based on the input values

Image Source: http://2new4.fjcdn.com/pictures/Sparta+chess+my+style+of+chess+_511e16_5692709.jpg
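
A short Python sketch of this world generator and its fitness score. The function names are ours, and whether fitness counts whole output strings or individual bits is not stated on the slide, so exact-match scoring here is an assumption.

import itertools
import random

def generate(num_input_bits, num_output_bits, seed=None):
    """Build a random mapping world: every possible input bit string gets
    a uniformly random output bit string."""
    rng = random.Random(seed)
    table = {}
    for bits in itertools.product("01", repeat=num_input_bits):
        table["".join(bits)] = "".join(rng.choice("01") for _ in range(num_output_bits))
    return table

def fitness(predict, table):
    """Fraction of inputs whose output the agent predicts exactly."""
    correct = sum(predict(inp) == out for inp, out in table.items())
    return correct / len(table)

world = generate(3, 2, seed=0)          # analogous to the Generate(3, 2) example above
print(world)
print("all-zeros guesser:", fitness(lambda inp: "00", world))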

Page 20: Setup

- Markov Brains w/ 16 nodes
- Each world: 8 input & 3 output bits (i.e. 256 inputs, each with 8 possible outputs)
- Training: Multi-Task Evolution on (K-1) worlds; 5,000 generations & popSize = 100
  - Fitness of each individual is based on evolving 10 mutated copies of it for 25 generations in each of the (K-1) worlds (sketched below)
- Testing: Single-Task Evolution on world K; 10,000 generations & popSize = 100
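
A Python sketch of the nested fitness evaluation flagged above as "sketched below". The "brain" here is a plain input-to-output answer table rather than a real Markov Brain, the inner hill-climbing loop and the averaging across worlds are our reading of the bullet, and K is illustrative.

import itertools
import random

def generate(num_input_bits, num_output_bits, seed):
    rng = random.Random(seed)
    return {"".join(b): "".join(rng.choice("01") for _ in range(num_output_bits))
            for b in itertools.product("01", repeat=num_input_bits)}

def random_brain():
    """Stand-in 'brain': an answer table over all 256 inputs (not a real Markov Brain)."""
    return {"".join(b): "".join(random.choice("01") for _ in range(3))
            for b in itertools.product("01", repeat=8)}

def mutate(brain, n_changes=8):
    child = dict(brain)
    for key in random.sample(sorted(child), n_changes):
        child[key] = "".join(random.choice("01") for _ in range(3))
    return child

def world_fitness(brain, world):
    return sum(brain[i] == o for i, o in world.items()) / len(world)

def multitask_fitness(individual, training_worlds, copies=10, generations=25):
    """Evolve `copies` mutated copies of `individual` for `generations`
    in each training world; average the best fitness reached per world
    (our reading of the bullet above; the exact aggregation is not stated)."""
    per_world_best = []
    for world in training_worlds:
        pop = [mutate(individual) for _ in range(copies)]
        best = max(world_fitness(b, world) for b in pop)
        for _ in range(generations):
            parent = max(pop, key=lambda b: world_fitness(b, world))
            pop = [mutate(parent) for _ in range(copies)]
            best = max(best, max(world_fitness(b, world) for b in pop))
        per_world_best.append(best)
    return sum(per_world_best) / len(per_world_best)

K = 6                                        # illustrative; the slides do not give K
worlds = [generate(8, 3, seed=s) for s in range(K)]
training_worlds, test_world = worlds[:-1], worlds[-1]
print("multi-task fitness of a random brain:", multitask_fitness(random_brain(), training_worlds))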

Page 21: Results

Page 22: Relevant XKCD

Page 23: Results

Page 24: Shifted

Page 25: Results

Page 26: Results

Page 27: Conclusion

Not the stunning success we had hoped for.

Experiment 1:
- Did not let the training evolution run long enough (results similar to randomly seeded runs)

Experiment 2:
- The Random Mapping problems were just dummy problems for a proof of concept
- Need to either modify them or use different problems that do have some underlying similarity, so the multi-task evolution initialization has something to exploit
- Allowing an ancestor to evolve descendants as a means of evaluating its own fitness is unneeded -> simply evaluate the fitness of the given individual as is

Page 28: Future Work

Fix the world setup:
- Equivalence for Berry World
- Different distribution for Random World
  - Ranging the similarity between the worlds

Try different worlds.

Page 29: Future Work

Flow diagram (same comparison as Page 18): initialize a random pop P0 = {p1, p2, ..., pn}, evolve it where Fitness = the training fitness to obtain P*0 = Pfinal, then evolve both P*0 and a fresh random P0 where Fitness = the target task, and compare the best fitness found (?? > = < ??).

Page 30: Thank You. Questions?