Mathematical model determining the optimal parameters for the highest possible learning efficiency in Artificial Neural Networks

Sophia Kioulaphides, The Bronx High School of Science

Page 1: PowerPoint Presentation - Research Project 2015

Mathematical model determining the optimal parameters for the highest possible learning efficiency in Artificial Neural Networks

Sophia Kioulaphides, The Bronx High School of Science

Page 2

BACKGROUND

The human brain and its capabilities are still unsolved mysteries.

Scientists conceptualize technologies that will replicate the behavior of the human brain. Various companies, including Facebook, have shown a keen interest in understanding the thinking processes of the brain.

Computers built on a system of neurons can learn and remember as efficiently and quickly as the human brain.

Neurons, the “building blocks” of the brain, are connected by synapses and perform different functions together through a “chain-like process”. Neurons fire action potentials (electric pulses), sending messages and causing the phenomena of learning and memory retrieval. This process is similar to how a metal wire conducts a current, and it is what allows an Artificial Neural Network (ANN) to be formed.

Introduction Methodology Results Discussion

Page 3

LITERATURE REVIEW

The first ANN consisted of only one neuron. The model described human neural behavior mathematically.

The individual properties of neurons are also collective! The ANN behaves very much like a ferromagnetic system, which is how metals become magnets. One kind of ferromagnetic system is the Ising spin system, or Ising Model. Each unit of the Ising Model is in one of two states, “up” or “down”, similar to how a neuron either fires completely or does not fire at all.

When artificial neurons are connected, they tend towards the most ordered state, similar to how biological neurons tend towards a memory. There has to be a low level of disorder to maximize learning.
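The single-neuron model mentioned above (a McCulloch-Pitts-style threshold unit) is easy to sketch in code. This is a hypothetical minimal illustration, not the original model's implementation: the unit fires (outputs 1) only when the weighted sum of its inputs reaches a threshold.

```python
def neuron(inputs, weights, threshold):
    """A single threshold unit: fires (1) when the weighted sum of its
    inputs reaches the threshold, otherwise stays silent (0)."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# With these illustrative weights the unit behaves like an AND gate:
# it fires only when both inputs are active.
print(neuron([1, 1], [0.6, 0.6], 1.0))  # fires: 0.6 + 0.6 >= 1.0
print(neuron([1, 0], [0.6, 0.6], 1.0))  # silent: 0.6 < 1.0
```

The all-or-nothing output is exactly the “up”/“down” behavior the Ising analogy relies on.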

Page 4

LINK BETWEEN BIOLOGICAL AND PHYSICAL MODELS

Biological: The membrane potential optimizes towards the stable global minimum of the brain.
Physical: The energy function describes optimization in the brain in mathematical terms.

Order (biological): The extracellular electric fields create order in the brain and help with memory formation.
Order (physical): The external magnetic field affects cooperative magnetism in a magnetic system.

Disorder (biological): The weights of the inputs of neurons are altered, changing how the output is reached.
Disorder (physical): The pseudo-temperature causes the system to tend to the most stable unit.

Page 5

RESEARCH PROBLEM

The fundamental question of my research project was, “Under what conditions does the Ising Model (a model for the ANN) tend to the so-called global minimum, or what we would commonly call a memory, the fastest?” In other words, what do we need to do in order to maximize learning?

Those optimal conditions are necessary for the ANN to retain a particular memory the fastest and to store the maximum amount of information in the most efficient manner.

The continuing study of those conditions will perpetuate the use and further development of ANN computers that aim to operate with the same efficiency as the human brain.

Page 6

RESEARCH HYPOTHESIS

Now, for some technical talk:

The pseudo-temperatures, denoted by T, optimize the system when they are below the Curie Point, represented by TC (about 2.27 in units of J/kB; the pseudo-temperature is dimensionless rather than measured in kelvin). However, learning is faster at temperatures on the higher end of the range from 0 to TC. So the fastest learning will most likely occur near TC, but once the pseudo-temperature rises above this value, the system will not tend towards any particular state.
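The quoted value of the Curie Point comes from Onsager's exact solution of the two-dimensional Ising model; in the dimensionless units used here it reads:

```latex
k_B T_C = \frac{2J}{\ln\left(1 + \sqrt{2}\right)} \approx 2.269\, J
```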

Page 7

SIGNIFICANCE

The theoretical limit for efficient learning had to be found. Once that limit is known, it gives engineers a head start in creating a new generation of neural computers.

These computers will be able to learn, make decisions, and remember just like humans. Humans use these abilities to perform everyday tasks such as discerning handwriting, and even to save lives by detecting the presence of a bomb.

In addition to studying the functions of a healthy brain, further studies of brains affected by neurodegenerative disorders can be pursued, possibly leading to a cure.

Page 8

HOW TO MODEL A LEARNING PROCESS

Recent studies dealt with the biology of neural networks. They experimentally showed how neural networks respond to chemical impulses, such as drugs; when drugs are profusely consumed, the firing rate of neurons increases rapidly.

We need to break the complex neural network down to the simplest model that still retains all the properties of neurons. If we represent the brain as a simple computer, we see its basic binary function, where “neurons” are either firing or not firing at all; in other words, the familiar 0 vs. 1 computing relationship.

A neural network stores information, and the maximum storage occurs in an ordered system. Parameters of the ANN will have values that optimize learning capabilities and maximize the phenomenon of learning.

Page 9

THE ENERGY FUNCTION

The energy function mathematically shows optimization.

The variable J represents the strength of the connection between two neurons Si and Sj.

The variable h (or H) represents the external magnetic field that is acting on one particular “neuron”.

Ok. Here comes the math.
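The equation itself did not survive in this transcript; in the standard Ising/Hopfield form, with the slide's symbols (J for the coupling, h for the field, Si = ±1 for the neurons), the energy function is:

```latex
E = -\sum_{\langle i,j \rangle} J_{ij}\, S_i S_j \;-\; \sum_i h_i\, S_i
```

where ⟨i,j⟩ runs over connected pairs of neurons.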

Page 10

The strength of the synapses, J:
N is the number of neurons in the system.
μ is the number from 1 to p assigned to a specific memory.

The magnetic field, h:
J is the strength of the synapses.
Θi is the action potential that the system has to overcome to fire a message.

ORDER PARAMETERS
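The formulas these definitions accompany are missing from the transcript; in the standard Hopfield model, with ξ^μ denoting the μ-th stored memory pattern, they would read (a reconstruction, not necessarily the slide's exact form):

```latex
J_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} \xi_i^{\mu}\, \xi_j^{\mu},
\qquad
h_i = \sum_{j} J_{ij}\, S_j - \Theta_i
```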

Page 11

The entropy, S, is the degree of disorder in the system.

kB is the Boltzmann Constant, which relates the temperature to the energy of one neuron.

n, just like N, is the number of neurons in the system.

The temperature, T, is defined as the reciprocal of the derivative of the system’s entropy with respect to the system’s total neural energy.

β, the inverse temperature appearing in the Boltzmann Distribution, sets how strongly the level of disorder responds to an increase in energy.
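Again the formulas themselves are missing from the transcript; the standard statistical-mechanics definitions matching these descriptions are (a reconstruction):

```latex
S = k_B \ln \Omega, \qquad
\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)^{-1\cdot(-1)} = \left(\frac{\partial S}{\partial E}\right)^{-1}\!, \qquad
\beta = \frac{1}{k_B T}, \qquad
P(E) \propto e^{-\beta E}
```

where Ω counts the configurations available to the n neurons.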


DISORDER PARAMETER

Page 12

In short, learning is order; entropy is disorder.

Learning is never pure: there is never perfect order. Order and disorder are connected because they coexist. The order parameters show how we can come to the most ordered state of our mental processes.

Disorder arises because some neurons do not connect entirely, when the connection on which the message is being transmitted is faulty.


Back to English:

Page 13

The Metropolis-Hastings Algorithm: a type of Monte Carlo algorithm. It shows that if the pseudo-temperature is lowered slowly, then thermal equilibrium is reached.

The pseudo-temperatures were chosen according to the Boltzmann Distribution.


OPTIMIZING ALGORITHM
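The slides do not include the implementation, but the Metropolis step they describe can be sketched as follows. This is a minimal illustration with hypothetical function names, using the slide's constants (J = -0.7, H = 1) on a 3x3 grid of ±1 “neurons”; it runs at a single fixed pseudo-temperature rather than reproducing the full experiment.

```python
import math
import random

J = -0.7   # synaptic coupling, held constant as in the slides
H = 1.0    # external field, held constant as in the slides
N = 3      # 3x3 grid: 9 "neurons"

def energy(s):
    """Total neural energy: -J over nearest-neighbour pairs, -H per spin."""
    e = 0.0
    for i in range(N):
        for j in range(N):
            if i + 1 < N:
                e -= J * s[i][j] * s[i + 1][j]
            if j + 1 < N:
                e -= J * s[i][j] * s[i][j + 1]
            e -= H * s[i][j]
    return e

def metropolis(temperature, steps=2000, rng=None):
    """Relax a random configuration at a fixed pseudo-temperature."""
    rng = rng or random.Random(0)
    s = [[rng.choice([-1, 1]) for _ in range(N)] for _ in range(N)]
    for _ in range(steps):
        i, j = rng.randrange(N), rng.randrange(N)
        old = energy(s)
        s[i][j] = -s[i][j]                 # propose flipping one "neuron"
        delta = energy(s) - old
        # Boltzmann acceptance rule: always keep downhill moves, keep
        # uphill moves only with probability exp(-delta / T)
        if delta > 0 and rng.random() >= math.exp(-delta / temperature):
            s[i][j] = -s[i][j]             # reject: flip back
    return s
```

Lowering `temperature` over successive calls mimics the slow cooling the slide describes: at low pseudo-temperature the grid settles into a low-energy (highly ordered) configuration.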

Page 14

THE SIMPLEST MODEL

The smallest dimensions of the ANN that still capture essential neural features: a 3x3 matrix. The configurations of the matrix have to do with how many “neurons” are completely firing or not firing at all, and where they are located in the matrix.

Each configuration was given a number from 1 to 2^N, where N is the number of “neurons”—in this case, 9. The numbers therefore ran from 1 to 512.

J was held constant at -0.7. H was held constant at 1. T was increased in increments of 0.03 in order to get a steady curve.


“Everything should be made as simple as possible, but not simpler.” –Albert Einstein
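The enumeration described above can be sketched directly. This is an illustrative reconstruction (names are hypothetical, and the absolute energy scale need not match the plots on the results slides): each number 0..511 is mapped to a 3x3 grid of ±1 spins via its binary digits, and the energy of every configuration is computed.

```python
J, H, N = -0.7, 1.0, 3   # constants from the slide

def energy(s):
    """Total neural energy of a 3x3 grid: -J over neighbour pairs, -H per spin."""
    e = 0.0
    for i in range(N):
        for j in range(N):
            if i + 1 < N:
                e -= J * s[i][j] * s[i + 1][j]
            if j + 1 < N:
                e -= J * s[i][j] * s[i][j + 1]
            e -= H * s[i][j]
    return e

def config(k):
    """Map a configuration number 0..2**9-1 to a 3x3 grid of +/-1 spins
    via the binary digits of k (the slides number these 1 to 512)."""
    return [[2 * ((k >> (i * N + j)) & 1) - 1 for j in range(N)]
            for i in range(N)]

energies = [energy(config(k)) for k in range(2 ** (N * N))]
```

Plotting `energies` against the configuration number gives the kind of energy-vs-configuration curve shown on the results slides.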

Page 15

[Figure: two plots of Total Neural Energy (E) versus Configuration # (1 to 512), one per panel.]

Page 16

[Figure: two plots versus Pseudo-Temperature # — one showing # of Steps, the other showing Frequency.]

Page 17

WHAT IS NEW?

The goal of this project was to create guidelines that would help scientists and engineers find the highest possible learning efficiency of a neural network.

There are optimal parameters for learning efficiency; we just need to find them. I have developed a theoretical model, so now it is time to move on to the application stage.

There is a new understanding of how quickly a memory can be retrieved; if we know that, we can improve the retrieval mechanism.

The simplest, smallest model that behaves the same way as a biological neural network has been created.

This model was able to learn and retrieve memories, both of which are phenomena characteristic of the human brain.

Page 18

WHAT REMAINS TO BE LEARNED?

The small 3x3 model only captures general features of the neuron. A more sophisticated model would capture the detailed properties of a neuron.

What else would you like to know?
