

International Journal of Emerging Trends & Technology in Computer Science (IJETTCS) Web Site: www.ijettcs.org Email: [email protected]

Volume 6, Issue 3, May- June 2017 ISSN 2278-6856

Volume 6, Issue 3, May – June 2017 Page 328

Abstract
This paper utilizes a new mean square error function for training backpropagation (BP) neural networks to solve pattern classification problems. The new error function is referred to as the arctan mean square error (AMSE). The modification of the MSE is optimal in the sense of convergence rate, training time and simulation time for an asymptotically large number of statistically independent training patterns. The neural network concept used is based on "learning by example". Experimental results are presented. The application selected is the classification of offline English handwritten characters.

Index Terms—Backpropagation algorithm, Mean square error, Character recognition.

I. INTRODUCTION
Artificial neural networks have proved to be an efficient tool for the classification and prediction of various patterns. Backpropagation (BP) neural network classifiers show good performance in a wide range of applications. Training such a classifier amounts to the minimization of a cost function over an available training set.

For pattern classification in general, and for BP in particular, the cost function used more than any other alternative is the standard mean square error (MSE). The standard mean square error has the advantage of requiring no prior knowledge of the class distributions [1]. This paper proposes training a BP neural network using a new cost function referred to as the arctan mean square error (AMSE). Relative to the standard MSE, the AMSE is optimal in yielding a network output with minimum variance for an asymptotically large number of statistically independent training patterns. This paper also compares the performance of a BP neural network trained with AMSE against one trained with the standard MSE.

The application selected is the recognition of English handwritten characters.

This paper is organized as follows: Section II reviews the performance metrics, Section III introduces the proposed arctan mean square error, and Section IV describes the character-recognition task. The performance of AMSE and the standard MSE is then compared with respect to training time, simulation time and recognition accuracy, and the final sections present the results and conclusions.

II. PERFORMANCE METRICS
An artificial neural network is a representation of the human brain that tries to learn and recognize its training input patterns. To train on an input pattern and measure its performance, a performance function must be defined. The functions used here are:

1.) Mean squared error (MSE)
The mean squared error is defined as

MSE = (1/N) Σ_p Σ_{i=1}^{N} (t_pi − y_pi)²

where t_pi = predicted value for data point i; y_pi = actual value for data point i; N = total number of data points.
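The MSE above can be sketched in a few lines of NumPy. This is an illustrative computation, not code from the paper; the array shapes (one row per pattern) and function name are our assumptions.

```python
import numpy as np

def mse(t, y):
    """Mean squared error: average of the squared differences between
    the predicted values t and the actual values y over all data points."""
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean((t - y) ** 2))

# one pattern with two outputs: residuals 0 and 2 give MSE = (0 + 4)/2
print(mse([[1.0, 2.0]], [[1.0, 4.0]]))
```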

2.) Training with gradient descent and momentum [12]
Gradient descent with momentum, implemented by traingdm, allows a network to respond not only to the local gradient, but also to recent trends in the error surface. Acting like a low-pass filter, momentum allows the network to ignore small features in the error surface. Without momentum a network can get stuck in a shallow local minimum; with momentum it can slide through such a minimum.

Gradient descent with momentum depends on two training parameters. The parameter 'lr' indicates the learning rate, as in simple gradient descent. The parameter 'mc' is the momentum constant that defines the amount of momentum; it is set between 0 (no momentum) and values close to 1 (lots of momentum).

An Analysis of Neural Network Model Applied for Efficient Classification of Patterns

Shobhit Kumar1, Raghu Nath Verma2, Anil Kumar3, Sapna Singh4

1 Information Technology Department, Rajkiya Engineering College, Ambedkar Nagar, India
2,3 Computer Science and Engg. Department, Bundelkhand Institute of Engineering and Technology, Jhansi (India)
4 Computer Science and Engg. Department, Allen House Institute of Technology, Kanpur (India)


A momentum constant of 1 results in a network that is completely insensitive to the local gradient and, therefore, does not learn properly.
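The momentum update can be sketched as follows. This is a minimal illustration of the rule documented for traingdm (new step = mc times the previous step plus lr·(1 − mc) times the gradient term), written here for error minimization; the function name and default values are our assumptions, not the toolbox code.

```python
import numpy as np

def gdm_update(w, grad, step_prev, lr=0.1, mc=0.9):
    """One weight update of gradient descent with momentum.

    lr (learning rate) scales the current error gradient; mc (momentum
    constant, 0 <= mc < 1) blends in the previous step, so the update
    acts as a low-pass filter over the error surface.
    """
    step = mc * step_prev - (1.0 - mc) * lr * grad
    return w + step, step

# with mc = 0 this reduces to plain gradient descent: step = -lr * grad
w_new, step = gdm_update(1.0, 2.0, 0.0, lr=0.1, mc=0.0)
print(w_new)
```

With mc close to 1 the previous step dominates, which is exactly why an mc of 1 leaves the network insensitive to the local gradient.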

3.) Learning rate
The learning rate is a training parameter that controls the size of weight and bias changes during learning. A learning rule is a model for the types of methods used to train the system, and also a goal for what types of results are to be produced. The learning algorithm is the specific mathematical method used to update the inter-neuronal synaptic weights during every training iteration [wiki].

The network learns by adjusting weights. The process of adjusting the weights to make the neural network learn the relationship between the input and targets is known as learning or training. There are several techniques for training a network, of which the gradient descent method is the most common.

III. PROPOSED ARCTAN MEAN SQUARE ERROR
In statistics, the mean squared error (MSE) of an estimator is one of many ways to quantify the difference between the values implied by an estimator and the true values of the quantity being estimated. MSE is a risk function, corresponding to the expected value of the squared error (quadratic) loss. MSE measures the average of the squares of the "errors", the error being the amount by which the value implied by the estimator differs from the quantity to be estimated. The difference arises because of randomness, or because the estimator does not account for information that could produce a more accurate estimate [wiki].

Minimizing the squared error increases the recognition speed of a system with a defined number of input training patterns. The decrease in the error value is evaluated using the mathematical term 'arctan', i.e. the inverse tangent of a value.
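The relevant property of arctan is that it maps any residual into (−π/2, π/2): small errors pass through almost unchanged, while large outliers are strongly compressed. A quick sketch (our own illustration, not from the paper):

```python
import math

# arctan bounds every residual to (-pi/2, pi/2); compare the growth
# of the raw error with its arctan-squashed counterpart.
for e in (0.1, 1.0, 10.0, 1000.0):
    print(e, "->", round(math.atan(e), 4))
```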

The standard Mean Square Error value can be estimated by the formula

MSE = (1/N) Σ_p Σ_{i=1}^{N} (t_pi − y_pi)²

where t_pi = predicted value for data point i; y_pi = actual value for data point i; N = total number of data points.

The standard mean square error is advantageous because it requires no prior knowledge of the class distributions. It is widely used because it typically results in a smaller error value than alternative cost functions, and minimizing the MSE further yields a more accurate system for pattern recognition and other applications. The arctan mean square error (AMSE) has been found to reduce the error beyond the standard MSE. It is advantageous for systems with large datasets, where thousands of values must be evaluated, and it has proved useful in calculating the error for backpropagation neural networks.

The Arctan mean squared error can be estimated by the following formula

AMSE = (1/N) Σ_p Σ_{i=1}^{N} (arctan(t_pi − y_pi))²

where t_pi = predicted value for data point i; y_pi = actual value for data point i; N = total number of data points.
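The two cost functions differ only in the arctan applied to each residual before squaring. A side-by-side NumPy sketch (our illustration; function names and array shapes are assumptions):

```python
import numpy as np

def mse(t, y):
    t, y = np.asarray(t, float), np.asarray(y, float)
    return float(np.mean((t - y) ** 2))

def amse(t, y):
    # each residual is passed through arctan before squaring, so no
    # single large error can dominate the accumulated cost
    t, y = np.asarray(t, float), np.asarray(y, float)
    return float(np.mean(np.arctan(t - y) ** 2))

# for large residuals the arctan version is much smaller; for residuals
# near zero, arctan(e) ~ e and the two costs nearly coincide
print(mse([[0.0]], [[5.0]]), amse([[0.0]], [[5.0]]))
```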

IV. CHARACTER RECOGNITION
Each multilayer perceptron (MLP) network has a two-layer feed-forward structure with various nonlinear activation functions. Recognition of handwritten English characters using an MLP can be performed by the sequence shown in Figure 1.1. The steps for handwritten English character recognition (HECR) are:
Step 1: Scan the handwritten English characters.
Step 2: Convert the characters into binary form, i.e. 1s and 0s.
Step 3: Preprocess the characters, i.e. skeletonization and normalization [11].
Step 4: Recognize the patterns using the feature-extraction method.
Step 5: Obtain the final desired output character.

Figure 1.1 Steps for Character Recognition: Handwritten Characters → Gray Scaling → Binarization → Pre-Processing → Segmentation → Feature Extraction & Classification → Error Correction / Post-Processing → Final Recognized Characters
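The binarization and size-normalization stages of the pipeline can be sketched with NumPy alone. This is a simplified illustration, not the authors' code; the threshold of 128 and the nearest-neighbour resampling are our assumptions (the 30 × 30 target size follows the paper).

```python
import numpy as np

def binarize(gray, threshold=128):
    """Map a 0-255 grayscale character image to binary: ink pixels
    (darker than the threshold) become 1, background becomes 0."""
    return (np.asarray(gray) < threshold).astype(np.uint8)

def normalize_size(img, size=30):
    """Resample a binary character to size x size by nearest-neighbour
    indexing, matching the 30 x 30 pixel inputs used in the paper."""
    h, w = img.shape
    rows = (np.arange(size) * h) // size
    cols = (np.arange(size) * w) // size
    return img[np.ix_(rows, cols)]

# a dark 60 x 45 scan becomes a 30 x 30 binary input vector of 900 pixels
patch = normalize_size(binarize(np.zeros((60, 45))))
print(patch.shape)
```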


In the HECR system, MLP classifiers are constructed and then combined in a hidden layer. The HECR system consists of the following two steps:

1.) Each MLP network classifier is constructed and trained. Some English characters look almost the same, so the structure of each MLP classifier must be determined carefully. Every classifier has certain characteristics, and classifiers are incorporated based on those features. For this, each MLP classifier is trained using the error backpropagation (EBP) algorithm.

2.) In this step, the MLP network classifiers included in the hidden layer are trained using the EBP algorithm; training continues until the mean square error between the network output and the desired output falls below 1e-04 or the maximum number of epochs has been reached.

V. RESEARCH METHODOLOGY
The Artificial Neural Network (ANN) and gradient-descent learning techniques have been used on the handwritten-character dataset in order to compare the performance results obtained from these models.

(A) Empirical data collection: The data used were collected from the handwritten characters of 100 different people. Each character is converted into a 30 x 30 array of pixel values. (B) Steps for error and recognition-accuracy calculation: The training data are collected and a network is created with a predefined set of parameters. A threshold value is set that decides whether the obtained output meets the goal. The steps for finding the number of training data recognized are depicted in the flowchart in Figure 1.2.

Figure 1.2 Descriptive statistics of the data set are appended in Table 1.1.

VI. THE BACKPROPAGATION APPROACH
In the deployment of the backpropagation algorithm, every epoch of the training process involves the following steps: 1) the training data are fed to the network in a forward direction, producing results at the output layer; 2) the error is calculated at the output layer based on the predefined target information, and the necessary changes in the weights for each neuron leading to the output layer are determined from this error; 3) the weight changes are propagated back through the various layers until all the necessary weight modifications have been computed for the whole network. The calculated weight changes are then applied throughout the network, and the entire process is repeated with the next training pattern. An efficient system is one in which the network is able to recognize complex patterns in a noisy environment.
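The three steps above, together with the stopping criterion of Section IV (error below a goal or a maximum epoch count), can be sketched as a tiny two-layer network. This is a compact illustrative sketch under our own assumptions (network size, learning rate, momentum constant, sigmoid activations, and the XOR-style toy data), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bp(X, T, hidden=8, lr=0.5, mc=0.9, goal=1e-4, max_epochs=5000):
    """Minimal backpropagation with momentum: forward pass, output-layer
    error, backpropagated weight updates -- repeated until the MSE goal
    or the epoch limit is reached."""
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.uniform(-0.5, 0.5, (n_in, hidden))   # random initialisation
    W2 = rng.uniform(-0.5, 0.5, (hidden, n_out))
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    for epoch in range(1, max_epochs + 1):
        H = sigmoid(X @ W1)                        # 1) forward pass
        Y = sigmoid(H @ W2)
        err = T - Y
        cost = float(np.mean(err ** 2))
        if cost < goal:                            # stopping criterion
            break
        d2 = err * Y * (1 - Y)                     # 2) output-layer delta
        d1 = (d2 @ W2.T) * H * (1 - H)             # 3) backpropagated delta
        dW2 = mc * dW2 + lr * (H.T @ d2) / len(X)  # momentum-smoothed steps
        dW1 = mc * dW1 + lr * (X.T @ d1) / len(X)
        W2 += dW2
        W1 += dW1
    return W1, W2, cost, epoch

# toy usage: four 2-bit patterns with XOR targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[0], [1], [1], [0]], float)
W1, W2, final_mse, epochs = train_bp(X, T)
```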

The Artificial Neural Network (ANN) and gradient descent with momentum learning techniques have been employed on the dataset of handwritten English characters in order to compare the performance of the backpropagation neural network.

VII. EXPERIMENTATION AND TRAINING CONSIDERATIONS
Before training, all of the network weights were randomly initialised. Because of this random initialisation, the results of different runs on the same training and test sets differ. The usual way of dealing with this is to average the results over 10 or more runs; here the variability is reduced by fixing the initial weights for each run. Our main concern, however, was to find out whether it is efficient to use the arctan mean square error for backpropagation networks in character recognition, and whether it would give better results over a larger number of runs.

Each network was then trained by presenting it with all the patterns in the training set for a number of cycles, until the error of the network was reduced to a small value, as shown in Figure 1.3. The error reported for all patterns is the sum of the quadratic differences between the desired output and the real output over all patterns in the training set.

For most of the networks, very small error values could have been achieved by training the network for a longer time.

However, it was found that this improved the memorisation ability of the network but reduced its ability to generalise to unseen samples. To achieve better generalisation, training was stopped as soon as the error value fell below 0.001, whenever that could be achieved.


Further, the patterns were shuffled during training so that they were presented in different (random) orders in different cycles.

This was to avoid presenting the network with, for example, all samples of one character class followed by all samples of another, in which case the network would forget what it had learnt first.

VIII. CALCULATIONS FOR THE MEAN SQUARE ERROR VALUE
The tabulated data for the standard mean square error value, obtained from the formula discussed in Section III, are depicted in Table 1.1, where each parameter is shown separately.

Table 1.1 Mean Square Error Value for Various Numbers of Training Patterns

The standard mean square error is calculated for various numbers of training patterns so that our formulas and functions can be compared.

IX. CALCULATIONS FOR THE ARCTAN MEAN SQUARE ERROR VALUE
The modified mean square error value calculated after training the backpropagation network is shown in Table 1.2, where each parameter is depicted separately.

Table 1.2 Arctan Mean Square Error Value for Various Numbers of Training Patterns

The modified mean square error, i.e. the arctan mean square error, is calculated for various numbers of training patterns so that our formulas and parameters can be compared with the standard MSE.

The following data’s are being observed and through these data it could be concluded that Arctan Mean

Square Error shows better results on comparison with Mean square error.

Samples of the handwritten characters of the English alphabet, collected from 100 different people with different handwriting and containing noise, are shown in Figure 1.3.

Figure 1.3: Samples of handwritten characters of the English alphabet

X. PERFORMANCE COMPARISON
The mean square error and the proposed arctan mean square error are compared with respect to the error, training time and simulation time of a network. The implementation uses 500 character patterns of 900 (30 x 30) pixels each, i.e. a 900 x 500 input matrix. The network is kept fixed so as to check the validity

MEAN SQUARE ERROR
No. of Input Patterns   Error (MSE)   Training Time (ms)   Simulation Time (ms)   Iterations (epochs)
  2                     9.8893e-04    13.2634              0.1695                 801
  5                     9.8747e-04    12.2116              0.0479                 815
 10                     9.9613e-04    11.4958              0.0511                 811
100                     9.8721e-04    15.3281              0.0614                 796
200                     9.8721e-04    12.5310              0.0711                 796

ARCTAN MEAN SQUARE ERROR
No. of Input Patterns   Error (AMSE)   Training Time (ms)   Simulation Time (ms)   Iterations (epochs)
  2                     9.8782e-04     6.3532               0.0640                 403
  5                     9.8652e-04     6.8543               0.0568                 405
 10                     9.9575e-04     5.7141               0.0593                 406
100                     9.7496e-04     7.6779               0.1247                 396
200                     9.7496e-04     10.5441              0.0642                 396


and performance of the proposed network. The data being compared are tabulated in Table 1.3.

The graphs for the respective patterns are compared in Figure 1.4. From these graphs, conclusions can be drawn regarding the relative error value, the total number of epochs iterated (i.e. the convergence factor), the training time and the simulation time.

Table 1.3 Comparative Analysis of the various parameters for varying patterns

Figure 1.4 a) Comparison between arctan MSE and standard MSE; b) training time obtained when trained by MSE and AMSE separately; c) simulation time compared between arctan MSE and standard MSE; d) number of epochs when trained by MSE and AMSE separately.

COMPARISON BETWEEN MSE AND AMSE FOR 500 PATTERNS
In the tables and graphs above the results were shown individually. Now a comparative analysis of the two error values, calculated for 500 patterns, is tabulated.

Table 1.4 Comparison between MSE and AMSE w.r.t. error, training time and no. of iterations

The results show better performance for the proposed approach applied to offline handwritten character recognition. Table 1.4 shows that the network trained with the arctan mean square error reaches an error value comparable to that of the standard mean square error while converging in roughly half the number of epochs.

Figure 1.5 Comparative analysis of various factors w.r.t. MSE and AMSE


Performance Function   Network Used      Training Algorithm               No. of Input Patterns   Error        Training Time (ms)   Simulation Time (ms)   Iterations
MSE                    Backpropagation   Gradient descent with momentum   500                     9.7567e-04   14.0404              0.0574                 797
AMSE                   Backpropagation   Gradient descent with momentum   500                     9.9509e-04   9.7783               0.0564                 397


After training the data using the backpropagation neural network, the network takes less training time for AMSE than for MSE. The graphs for the various parameters are shown in Figure 1.5. Finally, the two performance functions are compared with respect to error, training time, simulation time and number of iterations. The graphs show that the AMSE values for training time and number of epochs are lower than the corresponding MSE values, which supports the proposed approach; the parameters that show positive results are the rate of convergence and the training time.

XI. RESULTS
The tables and graphs show that the proposed arctan mean square error yields a reduced error value with respect to the standard mean square error. The network was trained for 100 different data values. The training time shows some variation in its values, but the simulation time, i.e. the recognition time, is lower for AMSE than for MSE.

REFERENCES
[1] Dr. Dhafer R. Zaghar, "Reduction of the error in the hardware neural network", Al-Khwarizmi Engineering Journal, Vol. 3, No. 2, pp. 1-7, 2007.

[2] Bogdan M. Wilamowski, Serdar Iplikci, Okyay Kaynak, M. Önder Efe, "An Algorithm for Fast Convergence in Training Neural Networks", 0-7803-7044-9/01 ©2001 IEEE.
[3] Hussein Rady, "Rényi's Entropy and Mean Square Error for Improving the Convergence of Multilayer Backpropagation Neural Networks: A Comparative Study", 117905-8282, IJECS-IJENS, Vol. 11, No. 05, October 2005.
[4] Hossam Osman, Steven D. Blostein, "New cost function for backpropagation neural networks with application to SAR imagery classification".
[5] G. R. Finnie and G. E. Wittig, "A Comparison of Software Effort Estimation Techniques: Using Function Points with Neural Networks, Case Based Reasoning and Regression Models", Journal of Systems and Software, Vol. 39, pp. 281-289, 1997.
[6] G. R. Finnie and G. E. Wittig, "AI Tools for Software Development Effort Estimation", Proceedings of the International Conference on Software Engineering: Education and Practice (SEEP '96).
[7] K. Srinivasan and D. Fisher, "Machine Learning Approaches to Estimating Software Development Effort", IEEE Transactions on Software Engineering, Vol. 21, Feb. 1995.
[8] Sandhya Arora, Debotosh Bhattacharjee, Mita Nasipuri, L. Malik, M. Kundu and D. K. Basu, "Performance Comparison of SVM and ANN for Handwritten Devnagari Character Recognition", IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, May 2010.
[9] Srinivasa Kumar Devireddy, Settipalli Appa Rao, "Handwritten character recognition using backpropagation network", Journal of Theoretical and Applied Information Technology, 2005-2009 JATIT.
[10] Madhu Shahi, Dr. Anil K. Ahlawat, Mr. B. N. Pandey, "Literature Survey on Offline Recognition of Handwritten Hindi Curve Script Using ANN Approach", International Journal of Scientific and Research Publications, Volume 2, Issue 5, May 2012, ISSN 2250-3153.
[11] Naveen Garg, Sandeep Kaur, "Improvement in Efficiency of Recognition of Handwritten Gurumukhi Script", IJCST, Vol. 2, Issue 3, September 2011.
[12] http://www.mathworks.in/help/nnet/ref/traingdm.html

AUTHORS

SHOBHIT KUMAR belongs to Lucknow, the capital of Uttar Pradesh. Mr. Shobhit Kumar received both his primary and professional education in Lucknow, and completed his high school and intermediate studies under the UP Board. He received his Bachelor's and Master's degrees in Technology in Computer Science and Engineering from Dr. A. P. J. Abdul Kalam Technical University (formerly known as UPTU), and is pursuing a Ph.D. in Computer Science and Engineering from the same university. He has worked as a lecturer and assistant professor in many reputed engineering colleges of Uttar Pradesh Technical University, and is currently working as an assistant professor at Rajkiya Engineering College, Ambedkar Nagar (a government engineering college) affiliated to Dr. A. P. J. Abdul Kalam Technical University (formerly known as UPTU and GBTU).

Dr. Raghu Nath Verma completed his B.Tech. degree in Computer Science from IET, Lucknow, his M.Tech. degree in Digital Electronics and Systems from KNIT, Sultanpur (India), and his Ph.D. degree in Computer Science and Engineering from Uttarakhand Technical University, Dehradun (India). He is currently working as an Assistant Professor in the Computer Science and Engineering Department at Bundelkhand Institute of Engineering and Technology, Jhansi. He has wide experience in the field of computer science, has a commendable and growing list of publications, and is a member of reputed professional societies.

Dr. A. K. Solanki is currently working as a Professor in the Computer Science and Engineering Department at Bundelkhand Institute of Engineering and Technology, Jhansi. He has wide experience in the field of computer science and also has a commendable and


growing list of publications, and is a member of reputed professional societies.

Sapna Singh received her B.Tech. and M.Tech. degrees in Computer Science and Engineering from Uttar Pradesh Technical University, Lucknow, in 2010 and 2013, respectively. She is now an assistant professor at Allen House Institute of Technology, Kanpur, India. Her areas of interest include neural networks, image processing and data structures. She has published a number of research papers in reputed national/international journals and conferences.