NYAI #7 - Top-Down vs. Bottom-Up Computational Creativity by Dr. Cole D. Ingraham
TRANSCRIPT
NYAI #7 (SPEAKER SERIES): Data Science to Operationalize
Machine Learning (Matthew Russell) & Computational Creativity
(Dr. Cole D. Ingraham DMA)
Top-Down vs. Bottom-Up Computational Creativity
Dr. Cole D. Ingraham, NYAI, Nov. 22, 2016
About Me
• originally from Santa Clara, CA
• DMA Music Composition
• currently working as lead developer at Amper Music (www.ampermusic.com)
• www.coleingraham.com
Abstract
• what does computational creativity mean, and what are some approaches?
• what are some examples of these approaches?
• how do we know the best way to solve a particular problem?
• primarily concerned with choosing an appropriate approach, less about implementation
Defining Creativity
• Creativity: the use of the imagination or original ideas, especially in the production of an artistic work.
• Artistic: having or revealing natural creative skill; aesthetically pleasing
• Aesthetic: a set of principles underlying and guiding the work of a particular artist or artistic movement
• very circular, difficult to concretely define
Defining Creativity
• my personal requirements:
• originality/novelty
• having some unified set of guiding principles
Top-Down Approach
• finding an answer that:
• is novel and useful (either for the individual or for society)
• demands that we reject ideas we had previously accepted
• results from intense motivation and persistence
• comes from clarifying a problem that was originally vague
https://en.wikipedia.org/wiki/Computational_creativity#Defining_creativity_in_computational_terms
Bottom-Up Approach
• artificial neural networks
• machine learning
• data, data, and more data
https://en.wikipedia.org/wiki/Computational_creativity#Defining_creativity_in_computational_terms
Comparing Approaches
• Top-Down
• primarily code driven
• dependent on defining structure
• requires considerable development time to be effective
• very “hands on” (things only improve as the code base improves)
• Bottom-Up
• primarily data driven
• heavy use of statistical analysis
• requires considerable amounts of data to be effective
• very “hands off” (run the algorithm on the data and wait)
Comparing Approaches
• Top-Down
• you define the structure of your program
• Bottom-Up
• you learn the structure of your program from analyzing data
Generation vs. Analysis
• Generation
• create something that does not exist
• novel output
• example use: music composition
• Analysis
• extract information from something that exists
• generalization of the input
• example use: music suggestion (Spotify, Pandora, etc.)
Generation & Analysis
• the creation of something new can/should be informed by analysis of previous work
• in order to analyze previous work, the work must already exist
• ... chicken and egg
Generation by Analysis
• analysis produces generalizations/averages of all input
• this can lead to largely homogeneous output
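As a toy illustration of how analysis-driven generation can collapse onto the average of its input (all names here are hypothetical, not code from the talk), consider a unigram model that always emits the most frequent symbol it saw:

```python
from collections import Counter

def train_unigram(corpus):
    """Analysis step: count symbol frequencies across all input sequences."""
    counts = Counter()
    for seq in corpus:
        counts.update(seq)
    return counts

def generate(counts, length):
    """Generation step: always emit the most frequent symbol, a caricature
    of analysis-driven generation collapsing onto the corpus average."""
    most_common = counts.most_common(1)[0][0]
    return [most_common] * length

# Three short "melodies" that all start and end on C.
corpus = [["C", "E", "G", "C"], ["C", "F", "A", "C"], ["C", "D", "G", "C"]]
counts = train_unigram(corpus)
print(generate(counts, 4))  # → ['C', 'C', 'C', 'C']
```

Real models are far less crude, but the tendency is the same: the more the output is driven by averages over the input, the more homogeneous it becomes.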
Generation, Then Analysis
• with some target in mind, generate output, then analyze it and see how close you got
• heuristic “guess and check” with no generation-time feedback
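A minimal sketch of this "guess and check" loop (the target, alphabet, and function names are illustrative assumptions, not the talk's code): candidates are generated blindly, analyzed only afterwards, and the best one is kept.

```python
import random

def closeness(candidate, target):
    """Post-hoc analysis: count how many positions match the target."""
    return sum(a == b for a, b in zip(candidate, target))

def generate_then_analyze(target, alphabet, attempts, seed=0):
    """Generate blindly, analyze afterwards, keep the best attempt.
    No feedback flows back into generation itself."""
    rng = random.Random(seed)
    best, best_score = None, -1
    for _ in range(attempts):
        candidate = [rng.choice(alphabet) for _ in target]
        score = closeness(candidate, target)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

best, score = generate_then_analyze("GATTACA", "ACGT", attempts=200, seed=1)
print(score, "".join(best))
```

Because the analysis never informs the generator, progress is purely a matter of how many guesses you can afford.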
Generation, No Analysis
• create output without analyzing anything before or after
• you must know what you are going to get
• ... or you must not care what you are going to get
How to Know Which Approach to Use?
It depends!
Generation / Analysis
• examples:
• compile-time code optimization: generate, analyze, repeat n times
• absolute realism: analyze, then generate (conceptually)
• generative abstract art: generate, do not analyze (sometimes)
Novelty
• introducing variance into the program’s output
• important considerations:
• how much variance?
• how to determine what can be varied?
• how much/little control over the variance do you want/need?
How Much Variance?
• largely subjective and project dependent
• beware of probability-driven decisions:
• probabilities say how frequently something happens
• probabilities do not inherently say why something should happen
What Can Be Varied?
• largely subjective and project dependent
• example - text generation:
• synonyms can vary a sentence without fundamentally changing its meaning
• sentence structure and grammar are far less flexible
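The text-generation example can be sketched as a simple lookup-based substitution (the synonym table and function names are hypothetical, just to illustrate the idea): words with synonyms can vary freely, while word order, the stand-in for sentence structure, is never touched.

```python
import random

# Hypothetical hand-made synonym table; real systems would use a thesaurus.
SYNONYMS = {
    "big": ["large", "huge"],
    "fast": ["quick", "rapid"],
}

def vary_sentence(words, rng):
    """Swap words for synonyms; word order (structure) is left untouched."""
    return [rng.choice(SYNONYMS.get(w, [w])) for w in words]

rng = random.Random(42)
print(vary_sentence(["the", "big", "fast", "dog"], rng))
```

Words without an entry fall back to themselves, so the sentence's meaning and grammar survive while its surface varies.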
How to Control Variance?
• largely subjective and project dependent (sensing a theme?)
• some options:
• explicitly defined seeding and propagation of pseudo-randomness
• working with finite quantities of input parameters
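The first option, explicit seeding, can be sketched as follows (the melody, parameter names, and `spread` knob are illustrative assumptions): the same seed always reproduces the same variation, so the variance is controlled and repeatable.

```python
import random

def controlled_variation(base_pitches, seed, spread=2):
    """Deterministic variance: the same seed always yields the same
    transpositions, so any output can be reproduced exactly for QA."""
    rng = random.Random(seed)
    return [p + rng.randint(-spread, spread) for p in base_pitches]

melody = [60, 62, 64, 65]  # MIDI note numbers
# Identical seeds reproduce identical output; a new seed gives a new variation.
assert controlled_variation(melody, seed=7) == controlled_variation(melody, seed=7)
```

Propagating an explicit seed through the program keeps every "random" decision reproducible on demand.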
A Note on Randomness
• nondeterministic code is a nightmare to QA
• interaction between many deterministic operations can create seemingly random outcomes
• (my opinion) it’s a better use of time to work on deterministically defined variance than it is to weed out undesired results from nondeterministic code
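One way deterministic operations can stand in for randomness, sketched here with hypothetical names: hash some context and let the digest drive the choice. The output looks arbitrary, yet the same inputs always give the same result, which makes QA tractable.

```python
import hashlib

def pseudo_random_choice(options, *context):
    """Fully deterministic, yet the output looks arbitrary: a hash of the
    context drives the selection, with no nondeterminism anywhere."""
    digest = hashlib.sha256("|".join(map(str, context)).encode()).digest()
    return options[digest[0] % len(options)]

# Same context, same answer, every run.
print(pseudo_random_choice(["C", "E", "G"], "verse", 2))
```

Undesired results can then be diagnosed by replaying the exact context that produced them, rather than hunting for a nondeterministic culprit.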
In the News
• neural network based approach (TensorFlow)
• uses a large training set
• limited by availability of data (not a problem for Google)
WaveNet (Google DeepMind)
• input: raw audio
• output: raw audio
• aimed at improving speech synthesis (very good at this)
• also used for music generation (less good at this)
WaveNet
“Since WaveNets can be used to model any audio signal, we thought it would also be fun to try to generate music. Unlike the TTS experiments, we didn’t condition the networks on an input sequence telling it what to play (such as a musical score); instead, we simply let it generate whatever it wanted to. When we trained it on a dataset of classical piano music, it produced fascinating samples [...]”
https://deepmind.com/blog/wavenet-generative-model-raw-audio/
WaveNet
• “trained it on a dataset of classical piano music”
• single instrument, single genre
• “we didn’t condition the networks on an input sequence telling it what to play (such as a musical score)”
• no large-scale structural awareness
• “let it generate whatever it wanted to”
• no external input
...what about top-down news?
Top-Down News?
• none that I’m aware of
• not a buzzword like neural networks
• (my opinion): everyone uses a top-down approach to some degree, they just don’t feel the need to talk about it
Why Am I Talking About It Then?
• to emphasize that:
• AI is more than just data science
• defining the structure of your problem is integral to finding its solution
• there are many more valid approaches to creative AI than what gets all the attention
• one should use the right tool for the right job
• some balance between top-down and bottom-up is often the right choice
Project Example: Amper Music
• Goal: create an AI music composition platform that lets users create personalized, professional quality music instantly with no experience required.
• Requirements:
• speed: it must be fast!
• quality: it must be believable!
• control: it must be collaborative!
Requirement: Speed
• neural networks are s l o w
• hand tuned algorithms can be fast
• lookup operations are very fast
Requirement: Quality
• neural networks can be very accurate, given a training set of sufficient size and quality
• most musical training (theory, performance practice) can be defined in code
• we:
• don’t (currently) have access to an appropriate training set
• do have a team of developers who are also highly trained professional musicians
Requirement: Control
• neural networks are “black boxes that just do what they do”
• defining music as a hierarchical structure offers handles to various aspects directly
• multiple levels of control:
• intuitive enough for the non-musician to use effectively
• powerful enough for the professional musician to use without feeling limited
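The idea of a hierarchy offering "handles" can be sketched with a toy structure (these classes and the `transpose` handle are hypothetical illustrations, not Amper's actual code): because the music is an explicit tree, one call at the top can reach every note, and each level exposes its own controls.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Note:
    pitch: int       # MIDI note number
    duration: float  # beats

@dataclass
class Phrase:
    notes: List[Note] = field(default_factory=list)

@dataclass
class Section:
    mood: str        # a high-level handle a non-musician can use
    phrases: List[Phrase] = field(default_factory=list)

@dataclass
class Piece:
    sections: List[Section] = field(default_factory=list)

    def transpose(self, semitones: int):
        """A structural 'handle': one call at the top reaches every note."""
        for section in self.sections:
            for phrase in section.phrases:
                for note in phrase.notes:
                    note.pitch += semitones

piece = Piece([Section("calm", [Phrase([Note(60, 1.0), Note(64, 1.0)])])])
piece.transpose(2)  # whole-piece control; lower levels stay editable directly
```

A non-musician can work with `mood` and whole-piece operations, while a professional can reach into individual phrases and notes, unlike a black-box model with no intermediate handles at all.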
Our Solution
• define various levels of musical structure in a declarative manner
• use machine learning / neural networks to generate data for all defined structural levels offline, as data becomes available, that can be used to augment runtime decisions
• structure the program in a way that can be easily scaled with machine learning, but works without it
• keep the “black box” parts of the program as minimal and confined as possible
Tooling
• runtime: Haskell
• offline: Haskell, Python, anything else (it’s offline!)
Tooling
• Why Haskell?
• compiled (fast)
• functional language (great for music and AI, like LISP)
• statically typed (safe, easy to maintain and refactor)
• Why Python?
• availability of libraries (TensorFlow as one example)
• scripting layer on top of C(++) (easier to use, still fast)
• used extensively in data science
Tooling
• Why Haskell?
• compiled to machine code (fast)
• functional language
• statically typed (safe, easy to maintain and refactor)
• Why not Python (for runtime)?
• interpreted / compiled to bytecode
• not a functional language (I wanted to use a functional language)
• dynamically typed (can be a nightmare to maintain, compared to static typing)
Thank you
Social Event: THE STOREHOUSE
69 West 23rd, New York, NY 10010 (2nd Floor)