
NEURAL NETWORK MODELS ON THE PREDICTION OF TOOL WEAR IN TURNING PROCESS: A COMPARISON STUDY

Mahdi S. Alajmi ¹, Samy E. Oraby ² and Ibrahim I. Esat ¹

¹ Department of Mechanical Engineering, Brunel University

Uxbridge, Middlesex, UB8 3PH, UK ² Department of Mechanical Production, College of Technological Studies

Shuwaikh, Kuwait Email: [email protected], [email protected], [email protected]

Abstract

A reliable estimation of tool life in the metal cutting process is the aim of many researchers. In this study, tool life prediction is modelled using different types of neural networks. A comparison study is conducted for the Feed-Forward Backpropagation Neural Network (FFBPNN), General Regression Neural Network (GRNN), Elman Network (EN), and Radial Basis Function Network (RBFN) to find the best model. The tool life data used in the study were obtained previously and include cutting speed, feed rate, depth of cut, cutting time, feed force (Fx), radial force (Fy), vertical force (Fz), notch wear, nose wear, and flank wear. A graphical study of the data reveals high non-linearity, and early experiments carried out in this study using a simple backpropagation network gave only marginally acceptable results. All neural network models in this study are trained on the same experimental data, acquired using Central Composite Design (CCD), which is one of the methods of design of experiments (DOE). The report presents a competitive study of the performance of these networks on the tool life prediction problem. It is shown that FFBPNN is the best model among them.

Keywords: Tool wear, Neural Networks, Turning Process

1. Introduction

One of the most important considerations in the machining process is tool wear and its relationship with the process parameters. Tool wear is a complex phenomenon that occurs in different and varied ways. Generally, worn tools have a direct effect on the quality of the surface finish, the dimensional precision, and ultimately the cost of the parts produced.

The prediction and detection of tool wear before the tool causes any damage on the machined surface becomes highly valuable in order to avoid loss of product, damage to the machine tool and associated loss in productivity

[1]. Prediction of tool wear using theoretical models is extremely difficult, largely due to the nonlinear characteristics of the wear mechanisms [2]. Recently, many researchers have focused on the utilization of artificial intelligence in tool wear prediction, such as genetic algorithms (GA) [3], simulated annealing [4], tabu search [5], and neural networks [6]. Neural networks have been successfully used in machining processes to predict tool wear and simulate the complex nature of the cutting process. The Feed-Forward Back Propagation Neural Network, General Regression Neural Network, Elman Network, and Radial Basis Function Network are four classes of neural networks. The Feed-Forward Back Propagation Neural Network is widely used in machining processes, whereas only a few applications of the General Regression, Elman, and Radial Basis Function Networks have been found in tool wear prediction. Here, all of these neural networks are applied to predict tool wear in turning processes. Experimental data obtained from Oraby [7] are used to train the network models of this paper. The objective of this paper is to compare prediction models of tool wear based on four types of neural networks: Feed-Forward Backpropagation Neural Networks, General Regression Neural Networks, Elman Networks, and Radial Basis Function Networks. The focus of this work is not only on the accuracy of the tool wear prediction, but also on the practical usability of these networks.

2. Tool wear and tool life: Theoretical Background

Tool-wear/tool-life investigation is a major topic in process planning and machining optimization. It is, therefore, essential to understand how different forms of tool wear occur and exert their influence on cutting tool performance. It is commonly known that tool failure occurs due to accelerated wear or premature chipping, breaking, and thermal cracking [8].


Many factors influence tool life, of which the most important are the cutting speed, feed, and depth of cut, as well as the type and hardness of the tool material and the workpiece material. Tool life can be defined as the length of time that a cutting tool will cut before it must be replaced. Cutting-tool life is one of the most important economic considerations in metal cutting. The first mathematical formula to approximate tool life was given in 1907 by F. W. Taylor [9]:

V T^n = C        (1)

where T is the cutting time (min) taken to develop a flank wear land of a certain dimension, V is the cutting speed (m/min), n is an exponent that depends on the cutting conditions, and C is a constant parameter.

For each combination of workpiece and cutting tool materials, each cutting condition has its own exponent n and constant C; these values vary with different tool and work materials. The extended Taylor equation can be expressed as:

T v^(1/n) d^(1/m) f^(1/k) = C        (2)

where T is the tool life (min), v is the cutting speed (m/min), f is the feed rate (mm/rev), d is the depth of cut (mm), and C is a constant for a given tool-work combination and tool geometry; 1/n, 1/m, and 1/k are the exponents of the speed, the depth of cut, and the feed, respectively.
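Once values for the exponents and the constant are available, equation (2) can be evaluated directly. The sketch below uses purely illustrative values for n, m, k, and C; none of these are constants fitted in this study or in [7]:

```python
# Tool life from the extended Taylor equation (2).
# The exponents n, m, k and the constant C below are illustrative
# placeholders, not values fitted to the experimental data of this paper.

def tool_life(v, f, d, n=0.25, m=0.4, k=0.6, c=1.2e9):
    """Tool life T (min): T = C / (v**(1/n) * d**(1/m) * f**(1/k)),
    with v in m/min, f in mm/rev, d in mm."""
    return c / (v ** (1.0 / n) * d ** (1.0 / m) * f ** (1.0 / k))

# Mid-level cutting conditions from Table 1: v = 104 m/min,
# f = 0.20 mm/rev, d = 2.25 mm. Raising the speed shortens the
# predicted tool life, as the positive exponent 1/n implies.
T_mid = tool_life(104, 0.20, 2.25)
T_fast = tool_life(145, 0.20, 2.25)
```

The inverse dependence on v, d, and f reproduces the qualitative behaviour stated in the text: harder cutting conditions give shorter tool life.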

3. Neural Networks (NN)

Artificial neural networks (ANN) attempt to simulate the behaviour of human brain neurons. An ANN has a parallel processing structure, which can be divided into several processing procedures that can be trained simultaneously. The neural network model is trained using a set of data consisting of input and output variables. In the training process, the structure of the model is self-adjusted according to the input-output data sets, and the final model can be used for prediction. The manner in which the neurons of a neural network are structured is intimately linked with the learning algorithm used to train the network. There are basically two main groups of training (learning) algorithms: supervised learning and unsupervised learning. In supervised learning, the correct results (target values, desired outputs) are known and are given to the network, so that the network can adjust its weights to try to match its outputs to the target values. Unsupervised learning is learning in the absence of target values, because the

network is trained using input signals only. In response, the network organizes itself internally to produce outputs that are consistent with a particular stimulus or group of similar stimuli. Inputs form clusters in the input space, where each cluster represents a set of elements of the real world with some common features. The capability of the network to obtain a low error value depends on several aspects, such as the network architecture, the training algorithm, the initial values of the weights and biases, the set of proposed examples, and the number of training epochs. Neural networks have been shown to be successful as predictive tools in a variety of ways: predicting that some event will or will not occur, predicting the time at which an event will occur, or predicting the level of some event outcome. To predict with an acceptable level of accuracy, a neural network must be trained with a sizable number of examples of past patterns together with the known outcome values.

3.1 Feed-Forward Back Propagation Neural Network

A back propagation network is usually a feed-forward, multi-layered network with a number of hidden layers. If the input to the network is represented by x_i, then the activation of neuron j is given by:

V_j = ∑_{i=1}^{n} x_i w_{ji} − θ_j        (3)

where w_{ji} are the weights connecting node j to node i, and θ_j is the threshold.

The output of neuron j, Y_j, is given by:

Y_j = 1 / (1 + e^(−V_j))        (4)

The error calculated at the k-th layer is found from the following relationship:

e = 0.5 ∑_{i=1}^{p} ∑_{k=1}^{m} (d_{ki} − c_{ki})^2        (5)

where m is the number of output nodes, p is the number of data patterns, d is the desired output, and c is the calculated output. The error is then back propagated through the network in such a way that the weights are modified by an amount ΔW_{kj}, given by:

ΔW_{kj}(I) = −η ∂e_k/∂W_{kj} + α ΔW_{kj}(I−1)        (6)

where I is the number of the current iteration, η is the learning rate (lr), α is the momentum parameter (mp), and ΔW_{kj}(I−1) is the weight correction from the previous iteration.
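Equations (3)-(6) can be illustrated with a minimal single-neuron example. The following sketch applies the sigmoid activation, the half sum-of-squares error, and the momentum weight update; the toy data set and the learning rate and momentum values are invented for illustration, not taken from the paper:

```python
import numpy as np

# One-neuron sketch of equations (3)-(6): sigmoid activation, half
# sum-of-squares error, and a gradient step with momentum. Data and
# hyperparameters are illustrative only.

rng = np.random.default_rng(0)
X = rng.random((8, 3))                        # 8 training patterns, 3 inputs
d = (X.sum(axis=1) > 1.5).astype(float)       # desired outputs

w, theta = np.zeros(3), 0.0
dw_prev, dth_prev = np.zeros(3), 0.0
lr, mp = 0.2, 0.8                             # learning rate eta, momentum alpha
errors = []

for _ in range(500):
    V = X @ w - theta                         # eq. (3): weighted sum minus threshold
    y = 1.0 / (1.0 + np.exp(-V))              # eq. (4): sigmoid output
    errors.append(0.5 * np.sum((d - y) ** 2)) # eq. (5): error over all patterns
    g = (y - d) * y * (1.0 - y)               # de/dV for each pattern
    dw = -lr * (X.T @ g) + mp * dw_prev       # eq. (6): gradient step plus momentum
    dth = lr * g.sum() + mp * dth_prev        # theta enters eq. (3) with a minus sign
    w, theta = w + dw, theta + dth
    dw_prev, dth_prev = dw, dth
```

Iterating the update drives the error of equation (5) down over the epochs, which is the training behaviour the text describes.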


The back-propagation of the error is iterated for a number of iterations until an acceptable error tolerance level is achieved through the training process.

3.2 General regression neural network

The Generalized Regression Neural Network is a one-pass learning algorithm and can generalize from examples as soon as they are stored. The network requires no prior knowledge of a specific functional form between input and output. In this network, each input x is assumed to belong to one of k clusters, where the number of patterns belonging to cluster j is k_j. Before a General Regression Neural Network can be constructed, the training data should be grouped into known clusters. If the number of training patterns is not too large, each pattern can act as an exemplar; that is, each cluster contains one pattern only (k_j = 1 for all j). Once the number of clusters and the cluster centroids are known and the patterns are scaled, the network can be created and trained. The architecture of this network consists of four layers: the input layer, pattern layer, summation layer, and output layer. The input layer distributes the input pattern to the next layer, the pattern unit layer. The pattern unit layer has k units, one for each exemplar or one for each cluster. This layer is fully connected to the input layer through adjustable weights. The weight values are set equal to the exemplar patterns or to the cluster centre values; thus, the weights on the connections from the input layer to the i-th pattern-layer unit have values equal to the i-th cluster centroid vector. The output of the pattern layer is fully connected to the summation layer through adjustable weights. The output layer performs a division operation on the outputs of the summation layer units to produce the estimate ẑ of the regression of z on x [10]:

ẑ = [ ∑_{i=1}^{p} A_i exp(−D_i^2 / 2σ^2) ] / [ ∑_{i=1}^{p} B_i exp(−D_i^2 / 2σ^2) ]        (7)

where A_i ≡ A_i(k) = A_i(k−1) + z_j and B_i ≡ B_i(k) = B_i(k−1) + 1, p is the number of training patterns, σ is the smoothing parameter that determines the decision surface boundary, and D_i is the distance on which the exponential (Gaussian) pattern-unit activation function operates.
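For the simplest case discussed above, where every training pattern forms its own cluster (k_j = 1, so A_i = z_i and B_i = 1), equation (7) reduces to a normalised Gaussian-weighted average of the stored outputs. A minimal sketch, with invented one-dimensional training samples and an illustrative smoothing parameter σ:

```python
import numpy as np

# Sketch of the GRNN estimate of equation (7) for the one-pattern-per-cluster
# case, where A_i = z_i and B_i = 1. Training samples and sigma are
# illustrative choices, not data from this study.

def grnn_predict(x, X_train, z_train, sigma=0.5):
    D2 = np.sum((X_train - x) ** 2, axis=1)   # squared Euclidean distances D_i^2
    w = np.exp(-D2 / (2.0 * sigma ** 2))      # Gaussian pattern-unit activations
    return np.sum(w * z_train) / np.sum(w)    # eq. (7): normalised weighted sum

X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
z_train = np.array([0.0, 1.0, 4.0, 9.0])      # samples of z = x**2
z_hat = grnn_predict(np.array([1.5]), X_train, z_train)
```

The query at x = 1.5 is pulled mostly toward the two nearest stored outputs (1 and 4), illustrating the one-pass, instance-based character of the network.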

3.3 Elman network

The Elman network is one of the well-known recurrent neural networks. This network has three layers: input, hidden, and output. The hidden layer neurons use nonlinear activation functions (sigmoid functions), while the output layer neurons use linear activation functions. According to the general principle of recurrent networks, there is feedback from the outputs of some neurons in the hidden or output layer to neurons in the context layer, which acts as an additional input layer. In the Elman network, these feedback connections run from the outputs of the hidden layer neurons to the context layer units, the so-called context units. The context units save the previous output values of the hidden layer neurons; these values are then fed back, fully connected, to the hidden layer neurons. This means there is a connection from every context unit to every hidden layer neuron. Furthermore, there is a recurrent connection from the hidden layer back to the context units; however, each hidden layer neuron is connected only to its associated context unit. The purpose of these units is to deal with input pattern dissonance; in other words, pattern conflicts can occur, in which multiple outputs are produced from a single input pattern.

3.4 Radial Basis Function Networks

Radial Basis Function networks are function approximation models that can be trained by examples to implement a desired input-output mapping. The performance of a radial basis function neural network depends on the number and locations of the centres and on the method used for learning the input-output mapping. The Radial Basis Function Network has attractive theoretical qualities: a short learning time, easy configuration, generally only one hidden layer, and a learning paradigm that enables an easy comprehension of how the radial basis function paradigm works in relation to other neural network models:

y_i = f_i(x) = ∑_{k=1}^{N} w_{ik} φ_k(x, c_k) = ∑_{k=1}^{N} w_{ik} φ_k(‖x − c_k‖_2),    i = 1, 2, …, m        (8)

where x ∈ ℝ^(n×1) is an input vector, φ_k(·) is a function from ℝ⁺ (the set of all positive numbers) to ℝ, ‖·‖_2 denotes the Euclidean norm, w_{ik} are the weights in the output layer, N is the number of neurons in the hidden layer, and c_k ∈ ℝ^(n×1) are the radial basis centres in the input vector space. For each neuron in the hidden layer, the Euclidean distance between its associated centre and the input to the network is computed. The output of a hidden layer neuron is a nonlinear function of this distance. Finally, the output of the


network is computed as a weighted sum of hidden layer outputs.
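Under the two-stage view described above (fixed nonlinear hidden layer, then a linear output layer), equation (8) can be sketched in a few lines. The Gaussian basis, the centres, the width, and the sine target below are illustrative choices, not details from this study:

```python
import numpy as np

# Sketch of the network of equation (8): Gaussian basis functions around
# fixed centres c_k, a fixed nonlinear hidden layer, and a linear output
# layer whose weights w_ik are fitted by least squares. Centres, width,
# and target function are illustrative.

def rbf_design(X, centres, width=1.0):
    # phi_k applied to the Euclidean distance ||x - c_k|| for every pair
    dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return np.exp(-(dist / width) ** 2)

X = np.linspace(0.0, 4.0, 20).reshape(-1, 1)
y = np.sin(X).ravel()
centres = np.array([[0.5], [1.5], [2.5], [3.5]])

Phi = rbf_design(X, centres)                 # hidden-layer outputs (fixed)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # only the output weights adapt
y_hat = Phi @ w                              # weighted sum of hidden outputs
fit_mse = np.mean((y_hat - y) ** 2)
```

Because only the output weights are adjustable once the centres are fixed, the training problem is linear, which is exactly the property the results section later uses to explain the network's behaviour.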

4. Experimental Data

The goal of the design of experiments (DOE) technique is to extract as much information as possible from a limited set of laboratory experiments. This knowledge extraction process is commonly referred to as sensitivity analysis, trend analysis, analysis of variance, or uncertainty quantification. There are many methods of design of experiments, such as factorial design, fractional factorial design, central composite design, and non-central composite design. Central composite design (CCD) is one of these methods and is the most appropriate for the purpose of metal cutting modelling [11]. The complete arrangement of the CCD consists of four blocks, each containing six tests. Levels of cutting conditions are shown in Table 1 and, for convenience, these levels are referred to as lowest, low, medium, high, and highest. The data used for training and testing the networks (Feed-Forward Backpropagation Neural Network, General Regression Neural Network, Elman Network, and Radial Basis Function Network) come from carefully designed experiments, which are well conditioned with no serious problems [7]. The study used a number of turning tests conducted on a Colchester Mascot 1600 lathe. The cutting conditions for the turning process are presented in Table 2. Besides the cutting conditions, the cutting forces, namely the feed force (Fx), radial force (Fy), and vertical force (Fz), were measured by a tool dynamometer designed to measure the cutting forces in the turning process. A high-precision three-axis universal measuring optical microscope, with an accuracy of 10⁻⁵ mm, was then used to measure the wear scars that had developed on the cutting edge. The resulting test conditions are listed in Table 3. Each sub-test lasted for about two minutes per cut. A trial was terminated either by catastrophic failure of the tool or by the tool reaching a wear land width of 0.3 mm.

The experiments produced 669 patterns, of which 440 were used for training the networks and the rest were used for verifying them. The Neural Network Toolbox of MATLAB was used to code the algorithms on a personal computer; each program executed and finished in a few minutes.
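The data handling described above can be sketched as follows. The patterns here are random stand-ins for the measured data (the study itself used MATLAB's Neural Network Toolbox, not the Python sketched here), but the split sizes and the error measure match the text:

```python
import numpy as np

# Illustrative sketch of the data handling described above: 669 patterns
# split into 440 for training and 229 for testing, with prediction quality
# judged by the mean square error. The patterns are random stand-ins.

rng = np.random.default_rng(1)
patterns = rng.random((669, 10))     # 7 inputs + 3 wear outputs per pattern

idx = rng.permutation(len(patterns))
train, test = patterns[idx[:440]], patterns[idx[440:]]

def mse(target, predicted):
    """Mean square error between target outputs and network outputs."""
    return float(np.mean((np.asarray(target) - np.asarray(predicted)) ** 2))
```

The same `mse` measure is what Table 4 later reports for the four networks.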

Table 1. Different Levels of Cutting Parameters

Table 2. The cutting conditions for the turning process

Tool insert               Triple-coated carbide tool inserts (Sandvik GC 435)
Tool holder               Sandvik CSTPR T-Max
Workpiece material        Steel 709M40
Rake angle (°)            6, 5, 0, 60, 90
Cutting speed (m/min)     50-206
Feed rate (mm/rev)        0.06-0.3
Depth of cut (DOC) (mm)   1.5-3

Table 3. The Central Composite Design (CCD)

Test   V     f     D      Test   V     f     D
1      72    0.12  2.00   13     206   0.20  2.25
2      145   0.30  2.00   14     50    0.20  2.25
3      145   0.12  2.50   15     104   0.60  2.25
4      72    0.30  2.50   16     104   0.06  2.25
5      104   0.20  2.25   17     104   0.20  3.00
6      104   0.20  2.25   18     104   0.20  1.50
7      145   0.12  2.00   19     206   0.20  2.25
8      72    0.30  2.00   20     50    0.20  2.25
9      72    0.12  2.50   21     104   0.60  2.25
10     145   0.30  2.50   22     104   0.06  2.50
11     104   0.20  2.25   23     104   0.20  3.00
12     104   0.20  2.25   24     104   0.20  1.50

Parameter   V (m/min)   f (mm/rev)   D.O.C (mm)
Lowest      50          0.06         1.50
Low         72          0.12         2.00
Medium      104         0.20         2.25
High        145         0.30         2.50
Highest     206         0.60         3.00

5. Results and discussion

In this study, four types of neural networks are used to predict the nose, flank, and notch wear in the turning process. Triple-coated carbide tool inserts (Sandvik GC 435) were utilized to machine hardened and tempered alloy steel (709M40). In total, 669 patterns are involved in the modelling of


tool wear using different types of neural networks, where 440 patterns are used for training and 229 patterns are used for testing the networks. Ten variables are involved in this process: the cutting speed (V), feed rate (f), depth of cut (D), cutting time (t), feed force, radial force, and vertical force are the input variables for the network, and the three types of wear (nose wear, flank wear, and notch wear) are the target values. The following sections discuss the performance of each network based on the behaviour of the predicted data versus the experimental data. In addition, the Mean Square Error (MSE), calculated from the difference between the target output and the network output, is reported for each network. Table 4 presents the mean square error results for the neural networks used to predict tool wear. The sufficiency of any network's performance is judged by its mean square error, which should approach zero to satisfy the required output.

Table 4. Mean square error for the four types of networks

Networks   FFBPNN     GRNN       EN         RBFN
MSE        0.000461   0.003100   0.001015   0.0084734

5.1 Feed-Forward Back Propagation

From the graphical study in Figure 1, the behaviour of the predicted data versus the experimental data reveals high nonlinearity, and the model gave marginally good results. The performance of this network shows that Feed-Forward Back Propagation obtains the lowest mean square error value (0.000461388) and produces the best graphs compared to the other networks. This is because back propagation is a gradient descent algorithm that compares actual outputs with desired outputs and reduces the error by back propagating it through the network and adjusting the weights.


Figure 1. Experimental vs. NN Predicted for the FFBP model: a) Nose wear, b) Flank wear, and c) Notch wear (* Experimental, NN Predicted)

5.2 General regression neural network

In Figure 2, the predicted data for the General Regression Neural Network exhibit a shift in most segments of the three wear graphs; visually, the predicted data do not appear to match the experimental data. Moreover, the performance of this network produced a mean square error of 0.003100. The network also requires extensive computer resources for storing and processing all the training samples [12].


Figure 2. Experimental vs. NN Predicted for the GRNN model: a) Nose wear, b) Flank wear, and c) Notch wear (* Experimental, NN Predicted)

5.3 Elman network

The data presented in Figure 3 show marginally acceptable results in the nose and flank wear graphs, with some shifting of segments in the notch wear graph. Visually, these graphs are better than those of the General Regression Neural Network and the Radial Basis Function Network. The performance of this network produced a mean square error of 0.00101503. The Elman Network is susceptible to stability problems, like many other recurrent systems. However, a comparison of the Elman Network with multilayer perceptron networks gives similar tool wear estimates [13].


Figure 3. Experimental vs. NN Predicted for the EN model: a) Nose wear, b) Flank wear, and c) Notch wear (* Experimental, NN Predicted)

5.4 Radial Basis Function Networks

Figure 4 presents the performance of the Radial Basis Function Network, with a mean square error of 0.00847349. The predicted data lie far from the experimental data and appear as a linear pattern. This is because the Radial Basis Function Network can be regarded as a special two-layer network that is linear in the parameters once all the centres and the nonlinearities in the hidden layer are fixed. The hidden layer then performs a fixed nonlinear transformation with no adjustable parameters, mapping the input space onto a new space. Throughout this study, for the sake of comparison, two-layer networks were used; in the case of the radial basis function network, this two-layer configuration appears to fail to represent the nonlinear behaviour. The output layer implements a linear combiner on the new space, and the only adjustable parameters are the weights of this linear combiner. Learning algorithms such as the Genetic Algorithm (GA) and Regularized Orthogonal Least Squares (ROLS) can be applied to enhance the performance of Radial Basis Function Networks. These algorithms choose appropriate centres one by one from the training data points until a satisfactory


network is obtained. This would not necessarily perform as required in our case, because the network performance would be only partially enhanced, and the performance shown in Figure 4 would remain unacceptable.


Figure 4. Experimental vs. NN Predicted for the RBFN model: a) Nose wear, b) Flank wear, and c) Notch wear (* Experimental, NN Predicted)

5.5 Training Time (TT)

The tabulated results in Table 5 show the comparison of training time between the four different types of neural networks. For each network there are seven input variables and three output variables, and the number of epochs is 300. The results show that the General Regression Neural Network obtained the lowest running time, followed by the Feed-Forward Back Propagation Neural Network, the Elman Network, and the Radial Basis Function Network, respectively. The General Regression Neural Network's training speed is extremely fast due to the simplicity of the network structure and its ease of implementation [12].

Table 5 Training time (TT) for the four types of NN.

Networks   FFBP     GRNN     EN       RBFN
TT (s)     7.5210   0.3100   12.789   175.66

6. Conclusion

In this study, tool life prediction is modelled using different types of neural networks, which have been established on the pertinent variables. The networks, namely the Feed-Forward Backpropagation Neural Network, General Regression Neural Network, Elman Network, and Radial Basis Function Network, have been trained on the same experimental data, selected with the method of design of experiments (DOE). According to the obtained results, the comparison shows that the Feed-Forward Backpropagation Neural Network model provides more accurate results (both graphically and in terms of MSE) than the other models. The graphical study of the Feed-Forward Backpropagation Neural Network data reveals high non-linearity and gave marginally good results for the three types of wear, followed by the Elman Network, which showed marginally acceptable results. The graphical studies of the General Regression Neural Network and the Radial Basis Function Network show unreliable results for the three types of wear due to the nature of these networks. Based on the mean square error results, the Feed-Forward Backpropagation Neural Network obtained the lowest MSE value among the networks, followed by the Elman Network, the General Regression Neural Network, and the Radial Basis Function Network. The General Regression Neural Network obtained the fastest training speed due to the simplicity of its network structure and ease of implementation.

References

[1] T. Ozel and A. Nadgir, "Prediction of flank wear by using backpropagation neural network modeling when cutting hardened H-13 steel with chamfered and honed CBN tools," International Journal of Machine Tools & Manufacture, vol. 42, pp. 287-297, 2002.

[2] J. H. Lee, D. E. Kim, and S. J. Lee, "Application of neural networks to flank wear prediction," Mechanical Systems & Signal Processing, vol. 10, pp. 265-276, 1996.

[3] X. Wang and I. S. Jawahir, "Optimization of multi-pass turning operations using genetic algorithms for the selection of cutting conditions and cutting tools with tool-wear effect," presented at Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference, 25-28 July 2001, Vancouver, BC, Canada, 2001.

[4] P. R. McMullen, M. Clark, D. Albritton, and J. Bell, "A correlation and heuristic approach for obtaining production sequences requiring a minimum of tool replacements," Computers & Operations Research, vol. 30, pp. 443-62, 2003.

[5] F. Kolahan, M. Liang, and M. Zuo, "Solving the combined part sequencing and tool replacement problem for an automated machining center: a tabu search approach," Computers & Industrial Engineering, vol. 28, pp. 731-743, 1995.

[6] S. Das, P. P. Bandyopadhyay, and A. B. Chattopadhyay, "Neural-networks-based tool wear monitoring in turning medium carbon steel using a coated carbide tool," Journal of Materials Processing Technology Proceedings of 1996 3rd Asia Pacific Conference on Materials Processing, Nov 12-14 1996, vol. 63, pp. 187-192, 1997.

[7] S. E. Oraby, "Mathematical Modeling and In-Process Monitoring Techniques for Cutting Tools." Sheffield, England: The University of Sheffield, 1989.

[8] G. Boothroyd and W. A. Knight, Fundamentals of Machining and Machine Tools. New York: Marcel Dekker, 1989.

[9] F. W. Taylor, "On the art of cutting metals," Transactions of the American Society of Mechanical Engineers, vol. 28, pp. 70-350, 1907.

[10] D. W. Patterson, Artificial Neural Networks: Theory and Applications. Singapore; London: Prentice Hall, 1996.

[11] S. E. Oraby and D. R. Hayhurst, "Development of Models for Tool Wear Force Relationships in Metal-Cutting," International Journal of Mechanical Sciences, vol. 33, pp. 125-138, 1991.

[12] E. W. M. Lee, C. P. Lim, R. K. K. Yuen, and S. M. Lo, "A Hybrid Neural Network Model for Noisy Data Regression," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, pp. 951-960, 2004.

[13] B. Sick, "On-line and indirect tool wear monitoring in turning with artificial neural networks: a review of more than a decade of research," Mechanical Systems and Signal Processing, vol. 16, pp. 487-546, 2002.