
Changeable Artificial Neural Network

Craig Lee Mark Adams

University of Derby

Abstract. This article examines a new artificial neural network that is able to change itself and that has functions which make it different from other neural networks. It is hoped that this new network will influence how other neural networks are designed in the future and give researchers ideas for functions that could be added to neural networks. The article shows that it is possible for a neural network to change itself and to learn to perform a task. However, it also demonstrates that the new network and its genetic algorithm need further research before they can fully perform any of the tasks and be as efficient as other neural networks.

Keywords: Artificial Neural Network, Genetic Algorithm, Hybrid Neural Network, Changeable Neural Network.

1 Introduction

This paper examines a new neural network that was created in the course of a Master of Science in Enterprise Computing. The network is able to change while it is being used, not just while it is being trained. It is based on the feed-forward neural network (Castellini et al, 2009) but has many changes that allow a network to alter itself without the use of any genetic or evolutionary algorithm. Other changes make it far more different from the types of neural networks that have been made before. It is hoped that the network can be used in applications that need artificial intelligence systems able to change to meet new requirements, such as computer games and antivirus systems. Many different neural networks have been made to solve the problems these applications have (Chellapilla and Fogel, 2002), but this new network may be able to outperform some of them.


1.1 Description of the Changeable Neural Network

This network resembles the feed-forward neural network in that it has layers, and each layer has neurons that use axons to connect to each other (Anam et al, 2009). The difference is that an axon does not have to connect to a neuron on the next layer: some axons may jump layers to connect to a neuron in a more distant layer. Another main difference is that the axons play a large role in this network. The axon here takes on much of the role of a neuron, in that each axon determines what its own output signal will be. Each axon also has a Statistic Memory Output Modifier that can change the axon's output, and thereby the output of the network. A further difference is that an axon is sometimes activated and sometimes not; when it is not activated, it is invisible to the neuron it connects to. When an axon's output falls within a given range, new axons or neurons are likely to be made; the new axon or neuron is added at a random position.
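The structure described above can be sketched as a minimal data model. The class and attribute names here are assumptions for illustration, not the prototype's actual code; the key points are that an axon carries its own weight and activation flag, and that its target may lie more than one layer ahead.

```python
class Axon:
    """One connection in the changeable network. Unlike a plain
    feed-forward weight, an axon may skip layers, and when it is
    inactive it is invisible to the neuron it connects to."""
    def __init__(self, target_layer, target_neuron, weight=1.0):
        self.target_layer = target_layer    # may be more than one layer ahead
        self.target_neuron = target_neuron
        self.weight = weight
        self.active = True                  # inactive axons produce no signal

    def output(self, signal):
        # None models "invisible": the receiving neuron ignores it
        return signal * self.weight if self.active else None

class Neuron:
    def __init__(self):
        self.axons = []

# three layers: 3 inputs, 2 hidden neurons, 1 output
net = [[Neuron() for _ in range(3)], [Neuron() for _ in range(2)], [Neuron()]]
# a skip connection: a layer-0 neuron connects straight to the output layer
net[0][0].axons.append(Axon(target_layer=2, target_neuron=0))
```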

1.1.1 Hidden Axon

The sigmoid function (Huang, 2009)



TL =

1.1.2 Output Neuron

The sigmoid function (Huang, 2009)

1.1.3 Hidden Neuron

NA =


NA is the number of axons that will be added to the network because of this neuron.

NN

NN is the number of neurons that will be added to the network because of this neuron.

1.1.4 Add Axon

JLL = TL − (L + 1)

JL = a random value with an upper limit of JLL and a lower limit of one.

JL is the number of layers after the current layer at which the axon will find a neuron to connect to.

JN = a random value with an upper limit of the number of neurons in the target layer and a lower limit of 0.

JN is the index of the neuron that is going to be the endpoint of the axon's connection.

1.1.5 Neurons

JLL = HL − (L + 1)

JL = a random value with an upper limit of JLL and a lower limit of one.

JL is the layer after the current layer to which the neuron will be added; a neuron can never be added to the output layer.
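The endpoint selection in Sections 1.1.4 and 1.1.5 can be written out directly from the definitions of JLL, JL, and JN. This is a sketch under the assumptions that layers are 0-indexed, that JN is a neuron index, and that at least one layer lies after L; the paper leaves the exact index bounds open.

```python
import random

def pick_axon_endpoint(L, TL, layer_sizes, rng=random):
    """Pick where a newly added axon ends (Section 1.1.4):
    JLL = TL - (L + 1) layers lie after the current layer L,
    JL is drawn between 1 and JLL, and JN indexes a neuron in
    the chosen target layer."""
    JLL = TL - (L + 1)
    JL = rng.randint(1, JLL)              # how many layers ahead to jump
    target_layer = L + JL
    JN = rng.randint(0, layer_sizes[target_layer] - 1)  # endpoint neuron
    return target_layer, JN

def pick_neuron_layer(L, HL, rng=random):
    """Pick the layer for a newly added neuron (Section 1.1.5):
    JLL = HL - (L + 1); the new neuron goes into one of the hidden
    layers after L, never into the output layer."""
    JLL = HL - (L + 1)
    return L + rng.randint(1, JLL)
```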

1.1.6 Variables

O = Output of axon.
W = Weight of axon.
A = Activation of the axon.
In = Number of times that an input has come up.



E = Number of times that an input was followed by an error.
Threshold = Amount needed to activate an axon.
LL = Low limit.
HL = High limit.
TL = Amount of life that an axon has.
D = Amount of life lost or gained.
R = Random number.
RL = Random number low limit.
RHL = Random number high limit.
NW = Value marking a neuron as new.
L = Layers.
TL = Total number of layers.
HL = Total number of hidden layers.

1.2 Learning Algorithms

Two different learning algorithms are used. It is expected that, by using more than one learning algorithm, the network will be able to perform any task it is given. Fig. 1 and Fig. 2 show the network before and after training using only the Statistic Memory Output Modifier; they show that the network is able to grow without the need for any genetic algorithm.

Fig. 1. The network just before it starts training. The green squares are the neurons and the yellow lines are the axons.


Fig. 2. The network after it has been trained for a task, showing a new neuron that has been made. The orange lines show axons that jump layers, and the blue lines show where more than two axons go from one neuron to different neurons.

Fig. 3. The network does sometimes remove neurons and axons.

1.2.1 Statistic Memory Output Modifier

Each axon records every unique input and the number of times it occurs. When an error occurs in the program, an error signal is sent to all the axons in the system, and each axon records the error signal against its previous input record. When an axon receives an input it has seen before, it looks up the number of times the input has been seen, divides it by the number of times the input was followed by an error signal, and multiplies the result by 5. If the axon has not seen the input before, the output is 0.5, which is then multiplied by 5. The result is then added to the inputs. The reason for having a Statistic Memory Output Modifier is to allow the axon to learn whilst the network is running.
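As a rough sketch, the modifier described above amounts to two counters per unique input. The class and method names are hypothetical, and the behaviour for an input that has been seen but never followed by an error is not stated in the paper, so it is treated like an unseen input here.

```python
class StatisticMemoryOutputModifier:
    """Per-axon record of unique inputs, how often each occurred, and
    how often each was followed by an error signal. The output is
    (times seen / times in error) * 5 for a known input, and
    0.5 * 5 when no error count is available."""
    def __init__(self):
        self.seen = {}       # input -> times observed
        self.errors = {}     # input -> times followed by an error signal
        self.last_input = None

    def modify(self, value):
        self.last_input = value
        self.seen[value] = self.seen.get(value, 0) + 1
        if self.errors.get(value, 0) > 0:
            return (self.seen[value] / self.errors[value]) * 5
        return 0.5 * 5       # unseen (or so far error-free) input

    def error_signal(self):
        # the error is recorded against the previous input record
        if self.last_input is not None:
            self.errors[self.last_input] = self.errors.get(self.last_input, 0) + 1
```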

1.2.2 Genetic Algorithm

The prototype uses a custom-made genetic algorithm so that it can learn and overcome any problem that the other learning algorithm cannot (Yao, 1999). The algorithm turns the network into chromosome objects that represent each of the layers; these are made up of sub-chromosome objects that represent neurons, which in turn contain sub-chromosome objects that represent the axons. The only layers that are not represented are the inputs and the outputs. The genetic algorithm applies cross-over and mutation to the networks (Kaka, et al, 2009). Mutations are added to neurons at random. Only the networks with the higher fitness rates are used for cross-over. The cross-over works by randomly picking lower-level chromosomes to add to the new network, and then using those lower-level chromosomes to make up the higher-level chromosomes. If one of the networks has more chromosomes than the other, the algorithm randomly decides whether to add each spare chromosome to the new network. When this is done, the chromosomes are used to build a new network. The amount of mutation can be changed in order to make the next generation differ from the last.
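The cross-over step might look like the following sketch, treating each network as a list of layer chromosomes. The function and its 50/50 choice for spare chromosomes are illustrative assumptions; the paper does not give the exact probabilities.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """Layer-wise cross-over sketch: for each position both parents
    share, one parent's chromosome is picked at random; spare
    chromosomes from the longer parent are kept or dropped at random.
    Each chromosome would itself hold neuron and axon sub-chromosomes."""
    shorter, longer = sorted([parent_a, parent_b], key=len)
    child = [rng.choice([shorter[i], longer[i]]) for i in range(len(shorter))]
    for spare in longer[len(shorter):]:
        if rng.random() < 0.5:          # assumed probability of keeping spares
            child.append(spare)
    return child
```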

2 Testing

The prototype was not the only neural network used in the tests: a feed-forward neural network with a back-propagation learning algorithm was also used. The feed-forward network was made using AForge.NET.

Both neural networks are trained to do the task used in each test. The feed-forward network is trained using the back-propagation learning algorithm, for 100 iterations with a learning rate of 0.5 per training iteration and a momentum of 0.2.

The prototype is trained by sending the inputs into the network; if the network gives the wrong output, the system sends an error signal to the network. This is done for each of the inputs that the task uses in the test, and is repeated 10 times. It is done for all the networks that the program has, and when this is finished the networks are sent to the genetic algorithm. One of the new networks made by the genetic algorithm is then used for the test.
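The training procedure above can be sketched as a simple loop over the training set. The `run`/`error_signal` interface and the stub network are hypothetical, standing in for the prototype.

```python
class StubNetwork:
    """Minimal stand-in for the prototype: always answers 0 and
    counts the error signals it receives."""
    def __init__(self):
        self.errors = 0
    def run(self, inputs):
        return 0
    def error_signal(self):
        self.errors += 1

def train_prototype(network, training_set, passes=10):
    """Feed each input through the network and raise an error signal
    whenever the output is wrong; repeat the whole set 10 times, as
    described in the testing procedure."""
    for _ in range(passes):
        for inputs, expected in training_set:
            if network.run(inputs) != expected:
                network.error_signal()
```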

2.1 XOR Problem


This test makes the neural network act like an XOR gate (Teuscher & Sanchez, 2000): when the neural network gets its inputs, it should produce the same output that an XOR gate would. For example, if the neural network gets the inputs 1 and 0, its output should be 1 (Mouret and Doncieux, 2009).

The results from the test show that the prototype was able to learn some of the XOR gate function but was not able to perform the task fully. The feed-forward neural network was able to perform the task, but it needed more training before it could outperform the prototype. The prototype also takes more time to perform the test than the feed-forward neural network.

Fig. 4. The number of correct outputs given by the networks per training set when doing the XOR test.

2.2 Classification of Dot Patterns

In this test the network is given dot patterns made up of cells with a value of 1 or 0. Each input neuron is given one of the cells, and each output neuron represents one of the patterns. The network has to work out from its inputs which pattern it has been given. Four patterns are used in this test. The test is to observe whether the network is able to perform tasks in the area of pattern recognition (Fahlman and Lebiere, 1991).
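A possible encoding for this test is sketched below; the four patterns themselves are not listed in the paper, so the grids are illustrative.

```python
# Each dot pattern is a grid of cells valued 0 or 1: one input neuron
# per cell, one output neuron per pattern (four patterns in the test).
PATTERNS = {
    "cross":    (0, 1, 0,
                 1, 1, 1,
                 0, 1, 0),
    "square":   (1, 1, 1,
                 1, 0, 1,
                 1, 1, 1),
    "diagonal": (1, 0, 0,
                 0, 1, 0,
                 0, 0, 1),
    "blank":    (0, 0, 0,
                 0, 0, 0,
                 0, 0, 0),
}
NAMES = list(PATTERNS)

def target_outputs(name):
    """One output neuron per pattern: the matching neuron should give 1."""
    return [1 if n == name else 0 for n in NAMES]
```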

The test shows that the prototype neural network was not able to get all the outputs correct. However, it was able to get one output correct once it had completed two sets of training. The feed-forward neural network obtained one correct output after the first set of training, but could not keep giving the correct output after the second set. It was, however, able to outperform the prototype after five sets of training, though it could not get all the outputs correct.



Fig. 5. The number of correct outputs given by the networks per training set when doing the classification of dot patterns test.

2.3 Word Classification

In this test the network is given a four-letter word with each of the letters in ASCII (Gorn et al, 1963). Each input neuron receives one of the letters, and there are four output neurons, each representing one of the words. The neural network has to select the correct output neuron from the inputs it has been given. Four words are used in this test.
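The input encoding for this test can be sketched as follows; how the ASCII codes are scaled for the network's inputs is not stated in the paper.

```python
def encode_word(word):
    """Encode a four-letter word as ASCII codes, one code per input
    neuron, matching the test setup described above."""
    if len(word) != 4:
        raise ValueError("the test uses four-letter words")
    return [ord(c) for c in word]
```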

The feed-forward neural network gives three correct outputs after the first set of training, but after the second set it could only achieve one correct output. The prototype could not give any correct output until the third set of training.

Fig. 6. The number of correct outputs given by the networks per training set when doing the word classification test.


3 Evaluation

The prototype has shown that it is able to learn, but it could not outperform the feed-forward neural network in the tests. One reason the prototype did not perform as well is that the genetic algorithm had problems with genetic drift. Genetic drift caused the networks used in the test to become more similar to one another, making it less likely that a network would be able to perform the test (Mouret and Doncieux, 2009). The number of networks used may also have been too low; a small population could have caused genetic drift to affect all of the networks more quickly (Farley, 2007). The genetic algorithm would need to be changed in a way that makes genetic drift less likely to occur, and it should be run with a larger population of neural networks that are similar to each other.

Another problem may have been the way the prototype tries to learn while the network is running. This may need to be changed so that the network can learn without the use of the genetic algorithm while still being able to change its own structure.

During the tests it was observed that the prototype took up more memory, whether in training or in use, than the feed-forward neural network. This happens whenever new axons or neurons are made, and it could become a problem when the network starts to make thousands or millions of axons and neurons, as these may take up more memory than a computer has. A variable has been added that limits the number of axons any neuron can have, but the genetic algorithm changes this variable, and there is no variable limiting the number of neurons a layer can have. A new version could be changed so that each layer can only hold a limited number of neurons at any one time.

The other problem is that the prototype takes more time than the feed-forward neural network to perform the tests. To solve this, the prototype should make greater use of multithreading (Kim and Cho, 2008).

4 Conclusion

The study showed that the algorithm was able to learn, but it was not able to fully learn any of the tasks it was given in the tests. It may be that the algorithm received too little training, or the cause may lie in the learning algorithm itself. The study also shows that the network was able to change its own structure without the need for any genetic algorithm, but only to a limit. It was not able to perform as well as the feed-forward neural network with back-propagation in any of the tests. However, with changes in new versions, it is anticipated that the algorithm could outperform feed-forward and other neural networks. It is expected that the new neural network could be used in some real-life applications, but the algorithm would need to be changed to be able to perform real-life tasks, to learn better when it is not using any genetic algorithm, and to be more changeable.

References

1. Anam, Sarawat., Islam, Shohidul., Kashem, M., Islam, M.N., Islam, M.R., Islam, M.S. (2009) Face Recognition Using Genetic Algorithm and Back Propagation Neural Network, Proceedings of the International MultiConference of Engineers and Computer Scientists 2009, Vol. 1 pp. 1 – 4 Citeseer [Online] Available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.148.5983&repl&type=pdf (Accessed on: 15/06/2011).

2. Castellini, Alberto., Manca, Vincenzo., Suzuki, Yasuhiro. (2009) Metabolic P System Flux Regulation by Artificial Neural Networks, Workshop on Membrane Computing, pp. 196 – 209 Research Group on Natural Computing [Online] Available at: http://www.gcn.us.es/files/169castelliniWmc10.pdf (Accessed on: 27/08/2011).

3. Chellapilla, Kumar., Fogel, David. (2002) Evolution, Neural Networks, Games, and Intelligence, Proceedings of the IEEE, pp 1471 - 1496, IEEE [Online]. Available at: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=784222 (Accessed on : 13/06/2011).

4. Fahlman, S., Lebiere, C. (1991) The Cascade-Correlation Learning Architecture, Advances in Neural Information Processing Systems 2, pp 524 – 532, Morgan Kaufmann [Online]. Available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.125.6421 (Accessed on : 13/06/2011).

5. Farley, Arthur. (2007) Choice and development, GECCO ’07 Proceedings of the 2007 GECCO conference companion on Genetic and evolutionary computation, pp. 2468 – 2474 ACM[Online] Available at: http://portal.acm.org/citation.cfm?id=1274012 (Accessed on: 22/07/2011).

6. Gorn, S., Bemer, R., Green, J. (1963) American standard code for information interchange, Communications of the ACM, Vol. 6 (8) ACM [Online] Available at: http://dl.acm.org/citation.cfm?id=367524 (Accessed on: 27/08/2011).

7. Huang, Yanbo., (2009) Advances in Artificial Neural Network – Methodological Development and Application, Algorithms, pp. 973 – 1007 MDPI[Online] Available at: http://www.mdpi.com/1999-4893/2/3/973 (Accessed on: 27/07/2011).

8. Kim, Kyung-Joong., Cho, Sung-Bae. (2008) Diverse Evolutionary Neural Network Based on Information Theory, Neural Information Processing, Lecture Notes in Computer Science, Vol. 4985 pp. 1007 – 1016 Springer [Online] Available at: http://www.springerlink.com/content/d18245667r830274/ (Accessed on: 22/07/2011).

9. Kohl, Nate., Miikkulainen, Risto. (2009) Evolving Neural Networks for Strategic Decision-Making Problems, Neural Networks, Vol. 22 (3) pp. 326 – 337, ScienceDirect [Online] Available at: http://www.sciencedirect.com/science/article/pii/S0893608009000379 (Accessed on: 01/06/2011).

10. Mouret, Jean-Baptiste., Doncieux, Stephane. (2009) Evolving modular neural-networks through exaptation, Evolutionary Computation, pp. 1570 – 1577 IEEE [Online] Available at: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4983129 (Accessed on: 13/07/2011).

11. Teuscher, Christof., Sanchez, Eduardo. (2000) A Revival of Turing’s Forgotten Connectionist Ideas: Exploring Unorganized Machines, proceedings of the Sixth Neural Computation and Psychology Workshop, pp 153 – 162, CiteSeer [Online] Available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.4044(Accessed on : 13/06/2011).

12. Yao, Xin. (1999) Evolving Artificial Neural Networks, Proceedings of the IEEE, pp 1423 – 1447, IEEE [Online]. Available at: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=784219(Accessed on : 13/06/2011).
