

International Journal of Machine Tools & Manufacture 42 (2002) 663–674

Selecting an artificial neural network for efficient modeling and accurate simulation of the milling process

Jorge F. Briceno a, Hazim El-Mounayri a,*, Snehasis Mukhopadhyay b

a Mechanical Engineering Department, Indiana University Purdue University Indianapolis (IUPUI), 723 W. Michigan Street, SL 260, Indianapolis, IN 46202-5132, USA
b Department of Computer and Information Science, IUPUI, 723 W. Michigan Street, Indianapolis, IN, USA

Received 16 August 2001; accepted 15 January 2002

    Abstract

In this paper, two supervised neural networks are used to estimate the forces developed during milling. These two Artificial Neural Networks (ANNs) are compared based on a cost function that relates the size of the training data to the accuracy of the model. Training experiments are screened based on design of experiments. Verification experiments are conducted to evaluate these two models. It is shown that the Radial Basis Network model is superior in this particular case. Orthogonal design, and specifically equally spaced dimensioning, proved to be a good way to select the training experiments. © 2002 Elsevier Science Ltd. All rights reserved.

    Keywords: End milling; Artificial neural networks; Back propagation; Radial basis

    1. Introduction

As one of the most useful methods of metal cutting, the milling process removes an amount of material through chip formation by the two continuous motions of a tool and a workpiece (see Fig. 1). In this

    Fig. 1. Flat-end milling process.

* Corresponding author. Tel.: +1-317-278-3320; fax: +1-317-274-9744.

    E-mail address: [email protected] (H. El-Mounayri).

0890-6955/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.

PII: S0890-6955(02)00008-1

case, the tool has a rotational motion (expressed by spindle speed) and the workpiece a linear movement (expressed by feed rate). The cutting edge is in contact with the material at many points, which change depending on the position of the edge relative to the material. This makes the process involved in terms of operational variables: many parameters have to be defined to conduct the operation. Among the principal ones are spindle speed (tool rotational velocity), feed rate (workpiece velocity), diameter of the tool, helix angle, radial depth of cut (RDC), axial depth of cut (ADC), rake angle, clearance angle and number of flutes. These variables, together with the tool and workpiece materials, define the state of cutting, which controls the process parameters. The latter include tool wear, tool life, surface finish, etc. The forces developed during the milling process can directly or indirectly measure/estimate such process parameters. In general, excessive cutting forces result in low product quality, while small cutting forces often indicate low machining efficiency [1]. Thus, controlling these forces is of paramount importance.

The majority of milling operations have been carried out based on cutting conditions determined from previous experience and/or existing machining data. On the


other hand, researchers have been trying to develop mathematical models that would predict the cutting forces based on the geometry and physical characteristics of the process. Such prediction could then be used to optimize the process. However, due to its complexity, the milling process still represents a challenge to the modeling and simulation research effort. In fact, most of the research work reported in this regard, which is based on either analytical or semi-empirical approaches, has in general shown only limited levels of accuracy and/or generality.

In the present paper, a different approach, based on advanced artificial intelligence techniques, is implemented and tested. More specifically, two different neural networks are used to predict the forces developed during end milling. The networks are then compared and the best network is selected based on certain criteria.

    2. Literature review

This relatively new methodology of the Artificial Neural Network (ANN), inspired by biological nervous systems, has found application in many real-world problems. One of the first engineering applications was reported by Minsky and Papert, developing perceptrons in 1969. The field then stayed dormant until about 1986, when the PDP group comprising Rumelhart and McClelland [2] published a two-volume book on explorations in the microstructure of cognition. It is only in the past few years that this methodology has been implemented in metal-cutting operations. In [3], a feed-forward neural network algorithm is implemented to predict flank wear in orthogonal turning; feed rate, cutting speed and force ratio are used as inputs. Liu and Wang [4] also propose a back propagation (BP) ANN for on-line modeling of the milling system. However, this study has several limitations, the most important of which is the use of a single machining parameter as the variable input. In [5], a more efficient model is created using a BP ANN (with the Levenberg–Marquardt approach). In this case, three inputs are considered, with different levels for each parameter. This approach has the disadvantage of requiring too many experiments to train the ANN, which, in terms of industrial usability, is unattractive and expensive.

Radial Basis Networks (RBN), a neural network architecture different from the multi-layer BP ANN, have been used mainly for pattern recognition. However, recent studies have indicated that this important network can be successfully used as a function modeler as well. Cook and Chiu [6] used a radial basis network as a framework to establish some network improvements, considering a time series model of a manufacturing process. Cheng and Lin [7] used three ANNs to estimate bending angles formed by laser; the RBN proved superior to the other models. Elanayar and Shin [8] utilized an RBN to predict tool wear based on certain machining conditions. A more general representation of the milling process cannot be found in the literature. In addition, no work has been conducted yet to evaluate and compare different artificial neural networks used to model the milling process.

3. Artificial neural network models of the milling process

In the current work, two supervised neural networks for modeling the milling process are compared. The first one is a back propagation neural network (BP) with log-sigmoid transfer functions in the hidden layers and a linear transfer function in the output layer; the second is a radial basis network (RBN) with Gaussian activation functions. The first ANN is very popular, especially in the area of manufacturing modeling, as its design and operation are relatively simple. The radial basis network has some additional advantages, such as rapid convergence and lower error. In particular, most commonly used RBNs involve fixed basis functions with linearly appearing unknown parameters in the output layer. In contrast, multi-layer BP ANNs involve adjustable basis functions, which result in nonlinearly appearing unknown parameters. It is commonly known that linearity in parameters in an RBN allows the use of least-squares-error-based updating schemes, which have faster convergence than the gradient-descent methods used to update the nonlinear parameters of a multi-layer BP ANN. On the other hand, it is also known that the use of fixed basis functions in an RBN results in exponential complexity in terms of the number of parameters, while the adjustable basis functions of a BP ANN can lead to much less complexity in terms of the number of parameters, or network size [9]. However, in practice, the number of parameters in an RBN starts becoming unmanageably large only when the number of input features increases beyond about 10 or 20, which is not the case in our study. Hence, the use of an RBN was practically possible for our problem. The MATLAB Neural Network Toolbox was used as a platform to create the networks.

    3.1. Back-propagation neural network (BPNN)

Since the objective is to evolve a model that relates selected inputs with outputs, the BPNN constitutes an excellent tool to approximate such a function. The general network topology is shown in Fig. 2. This network is composed of several neurons or processing elements (PE) operating in parallel. The PEs are arranged in different sections or layers: an input layer, hidden layer(s) and an output layer. Each layer is connected to other layers through the weight lines that


    Fig. 2. Back-propagation network topology.

come from each PE. The architecture of each PE is shown in Fig. 3. In general terms, the operation of this type of network can be described in terms of two major phases: the feed-forward phase and the back-propagation phase.

    3.1.1. Feed-forward phase

The input patterns are represented by the input PEs; here no calculation is made. The following set of neurons is found in the hidden layer(s). From the ith input PE, the information is conducted to the jth PE in the hidden layer through the weight W_ij. As depicted in Fig. 3, the incoming data at such an element is represented by

a_j = Σ_{i=0}^{n} W_ij I_i   (1)

where a_j is the linear combination of each I_i multiplied by W_ij, and is the value used in the activation function; I_i is the ith input; W_ij is the weight value from the ith input PE to the jth hidden PE; n is the number of incoming connections to the jth PE; a_j is the value fed to the squashing function, which gives the output of the jth PE

    Fig. 3. Architecture of an individual PE for BP.

to the next layer(s). The output of this element is given by

Y_j = SF_l(a_j)   (2)

where Y_j is the output value of the jth element and SF_l is the squashing function (or activation function) of the lth hidden layer.

In this paper, the squashing functions used in the hidden and output layers are the log-sigmoid transfer function and the linear transfer function, respectively. The value of Y_j is propagated through each further layer until the output is generated.
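The feed-forward pass of Eqs. (1) and (2) can be sketched as follows. This is a minimal Python illustration rather than the MATLAB Toolbox implementation used in the paper; the function name `feed_forward`, the random weights, and the omission of explicit bias terms are assumptions made for brevity.

```python
import numpy as np

def logsig(a):
    """Log-sigmoid squashing function used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-a))

def feed_forward(I, W_hidden, W_out):
    """One feed-forward pass through a single-hidden-layer BPNN.

    I        : input vector, shape (n,)
    W_hidden : hidden-layer weights, shape (n_hidden, n)
    W_out    : output-layer weights, shape (k, n_hidden)
    """
    a = W_hidden @ I   # a_j = sum_i W_ij * I_i, Eq. (1)
    Y = logsig(a)      # Y_j = SF(a_j), Eq. (2), log-sigmoid hidden layer
    return W_out @ Y   # linear transfer function in the output layer

# A 3.2.4 topology: 3 inputs, 2 hidden neurons, 4 outputs
rng = np.random.default_rng(0)
out = feed_forward(rng.random(3), rng.random((2, 3)), rng.random((4, 2)))
print(out.shape)   # (4,)
```

The matrix products compute every PE of a layer at once, which is how the sums of Eq. (1) are normally vectorized.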

    3.1.2. Back-propagation phase

In this phase the learning process is conducted. In general terms, the implementation of BP consists of updating the network weights in the direction in which the performance function decreases most rapidly. Once the output (Y_j) is calculated, it is compared with the target value (t_j). Then the following error is computed:

e_j = (1/2) (t_j − Y_j)^2   (3)

This error e_j corresponds to just one output PE. Therefore, the overall error (vector E) is expressed by

E = (e_1, …, e_j, …, e_k)   (4)

where k is the number of outputs.

The error is then transmitted backwards from the output layer to the input layer. The connection weights are updated by each PE, leading the network to converge. Several techniques can be used to conduct this back-propagation. One of the most widely used is the Levenberg–Marquardt technique, which approximates the Hessian matrix with the product of the Jacobian matrix and its transpose. In this way, the weight update is based on the following equation:

W_ij^new = W_ij^old − [J^T J + μI]^(−1) J^T e   (5)

where W_ij^new is the corrected weight for the jth PE coming from the previous layer; W_ij^old is the previous weight for the jth PE from the previous layer; J is the Jacobian matrix containing the first derivatives of the network errors with respect to the network weights and error signals for the ith pattern; μ is a scalar factor (when equal to zero, the method is the second-order Newton's method, while when set to a large number, it is gradient descent with a small step size); and e is the error signal for the jth PE.

This network offers a good generalization methodology and fast convergence using the Levenberg–Marquardt algorithm. In the same way, regularization is used to improve generalization, through automated regularization based on a Bayesian framework. For this particular case, since the size of the data is relatively


small, and based on White's theorem [10] (which states that one layer with non-linear activation functions is sufficient to map any non-linear functional relationship with a reasonable level of accuracy), a single hidden layer neural network was utilized and the number of weights is kept at around 3/4 of the number of experiments; that is:

Number of weights ≈ (Number of experiments) × (3/4)

Normally this factor is about 1/10, but due to the small size of the data in this particular case a factor of 3/4 was used, which still resulted in more data points than the number of unknown weights.
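A Levenberg–Marquardt weight update of the form of Eq. (5) can be sketched as below. This is a minimal illustration under the stated form of the update, not the Toolbox routine the paper used; `lm_update` and its argument shapes are hypothetical.

```python
import numpy as np

def lm_update(w, J, e, mu):
    """One Levenberg-Marquardt step in the form of Eq. (5).

    w  : flattened weight vector
    J  : Jacobian of errors w.r.t. weights, shape (n_samples, n_weights)
    e  : error (residual) vector, shape (n_samples,)
    mu : scalar factor; mu -> 0 approaches a Gauss-Newton step,
         large mu approaches gradient descent with a small step size
    """
    H = J.T @ J  # Hessian approximated by the Jacobian product
    step = np.linalg.solve(H + mu * np.eye(H.shape[0]), J.T @ e)
    return w - step
```

For a linear model the step with a tiny mu recovers the least-squares solution in one update, which is the fast-convergence property the text attributes to linearly appearing parameters.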

The effect of topology is also studied by considering different cases. The topologies are varied by changing the number of neurons in the hidden layer (n in Fig. 2) between a lower limit of 2 and an upper limit of 3/4 of the total number of experiments. The lower limit was selected based on the fact that one neuron in the hidden layer represents a model in which a linear relation is implied between the inputs and outputs. The following notation is used to describe the topology: 3.n.4, which means 3 inputs, n neurons in the hidden layer and 4 outputs.

    3.2. Radial basis network

This neural network utilizes the Gaussian curve to map values. The RBN works considerably well in function approximation: it is very fast in convergence and very simple to define in terms of a number of characteristic parameters.

The radial basis network (RBN), or radial basis function network, is a two-layer, fully interconnected neural network. It has two general characteristics: first, it may require more neurons than the standard feed-forward BP networks; second, it can be designed in a fraction of the time that it takes to train the aforementioned BP. A typical RBN is shown in Fig. 4. The network has

    Fig. 4. Radial basis network architecture.

n inputs and k outputs. The first layer is connected with the second, or internal, layer by weights that come from the input elements and the bias element. Weights from the internal layer to the outputs are also defined. Each element in the internal layer receives an input pattern vector and compares it with the mean weight vector that connects the input with the second layer. The weight vector determines the position of the center of the radial hidden element in the input space. Here, the activation function is similar to a Gaussian density function. This function is defined as follows:

Y_ki = e^{−Σ_h (u_ih − a_ih)^2 C / V^2}   (6)

Here Y_ki is the response of the ith element in the hidden layer; the weights u_ih define the mean value vector associated with each hidden PE; the a_ih represent the inputs. The parameter V is the factor that shapes the form of the squashing function and is called the spread factor; C is a constant. The PE architecture of the hidden layer can be seen in Fig. 5.

Finally, the connection weights between the second layer and the output layer are multiplied by the outputs of the internal elements (linear summation function), giving the output value to be compared with the target vectors:

z_kj = Σ_{i=0}^{p} W_ij Y_ki   (7)

The radial basis network is a very efficient network when function approximation is needed. This artificial neural network has the following characteristics:

1. it is very fast in comparison to back-propagation;
2. it has the ability to represent nonlinear functions;
3. it does not experience the local minima problems of back-propagation.

The RBN is being used in an increasing number of applications, providing a very helpful modeling tool.
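The RBN forward pass of Eqs. (6) and (7) can be sketched as follows. This is an illustrative Python sketch, not the Toolbox implementation; `rbn_forward` is a hypothetical helper, and the constant C of Eq. (6) is assumed to be 1.

```python
import numpy as np

def rbn_forward(a, U, W_out, V, C=1.0):
    """Forward pass of a radial basis network, Eqs. (6) and (7).

    a     : input pattern, shape (n,)
    U     : centre (mean-weight) vectors, one row per hidden PE, shape (p, n)
    W_out : output-layer weights, shape (k, p)
    V     : spread factor shaping the Gaussian
    C     : constant from Eq. (6) (assumed 1 here)
    """
    d2 = np.sum((U - a) ** 2, axis=1)  # sum_h (u_ih - a_ih)^2 per hidden PE
    Y = np.exp(-d2 * C / V ** 2)       # Gaussian activations, Eq. (6)
    return W_out @ Y                   # linear summation layer, Eq. (7)
```

An input that coincides with a centre drives that hidden unit's activation to 1, which is why the centres act as prototypes in the input space.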

    Fig. 5. Radial basis neuron.


In summary, two parameters need to be defined: the spread factor and the goal factor. The spread factor V has to be specified depending on the particular case at hand: it has to be smaller than the highest limit of the input data and larger than the lowest limit [11]. Based on this, and given that all the training data (as explained later) are mapped between 0 and 1, the three values considered are 0.2, 0.5 and 0.8. The goal factor value is set to zero, since error is a decisive factor in this study.

    4. Experimental data for training the ANN models

    4.1. Experimental set-up

The three components of the cutting force are measured using a Kistler 9257B dynamometer. These were sampled at 2500 Hz for 10 s each and stored in files in a spreadsheet format. The machine tool used for all the experiments in this work is a FADAL VMC-3016L 4-axis CNC milling machine.

The experiments were conducted using a 1/4 in. diameter, 2-flute, HSS, Do-All end mill. The tool geometry parameters were a 14° rake angle, a 16° primary clearance angle, and a 37.5° helix angle. This is a tool designed specifically for non-ferrous metals like aluminum and has a higher rake angle. The data acquisition package used was LabVIEW. The set-up can be seen in Fig. 6.

    4.2. Design of experiments

Design of experiments (DOE) is utilized here to determine the optimum number of experiments needed to successfully model the process within the required accuracy. This technique came into the picture as a link between statistical design and engineering knowledge. The literature on experimental design is extensive; this paper is not intended to cover aspects of experimental design techniques, and detailed information can be found in Ross

    Fig. 6. Experimental set-up.

[12]. Experimental design is made up of three stages. First, system design: in this phase, the flat end milling experimental set-up is built, including the dynamometer to measure the required forces. Second, parameter design: here the variables involved in the process are evaluated; in this particular case, orthogonal arrays are used to host the variations of process parameters. Third, tolerance design, which is not considered here, as this study aims at comparing two artificial neural networks. The present work constitutes a first step, and eventually further enhancements and refinements would be needed.

    4.3. Set of experiments

As noted earlier, there are a number of machining parameters that significantly affect the milling process. Of these parameters, spindle speed, feed rate and depth of cut have been varied in the current experiments, and the cutting force variation with time recorded. Other parameters, such as tool diameter, rake angle, etc., are kept constant for the scope of this study. In fact, the selected parameters are very critical in the flat-end milling process and should provide a basis for meaningful results for comparing the two models.

In order to select the data to be used in the training phase, several experimental sets were designed. All these sets represent states, or points, in a 3D space, since only 3 parameters were selected.

    4.3.1. First set of experiments

The first set consists of 27 experiments. Three values were selected for each parameter; this approach gives 3^3 = 27 experiments (full factorial). The ranges of values were selected based on recommendations given by [13]. Next, DOE is applied. Since no sensitivity relation is known at this stage, an equally spaced division is used in order to set the particular values. This results in the following:

feed rate (mm/min): range 100–200; selected values: 100, 150 and 200;
spindle speed (rpm): range 600–1800; selected values: 600, 1200 and 1800;
radial depth of cut (%D): range 0–100; selected values: 25, 62.5 and 100.

    The corresponding space is shown in Fig. 7.
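The full-factorial set above is just the Cartesian product of the three level lists, which can be generated programmatically. A sketch in Python (the paper itself worked in the MATLAB environment):

```python
from itertools import product

# Equally spaced levels for each of the three machining parameters
feed_rate     = [100, 150, 200]      # mm/min
spindle_speed = [600, 1200, 1800]    # rpm
rdc           = [25, 62.5, 100]      # radial depth of cut, %D

# Full factorial: every combination of the three levels gives 3^3 = 27 states
experiments = list(product(feed_rate, spindle_speed, rdc))
print(len(experiments))   # 27
```

The same product with five levels per parameter yields the 5^3 = 125 states mentioned for the second set.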

    4.3.2. Second set of experiments

For this set, the work-space is divided as shown in Fig. 8 (bold points are in different RDC planes). Again an equally spaced division is used. In this set, the number of states in the work-space has increased; in fact, the total number of experiments (full factorial) is then 5^3 = 125, 27 of which are already in the first set. This


    Fig. 7. First set of experiments.

    Fig. 8. Second set of experiments.

reduces the number of additional experiments towards a full factorial to 98. As mentioned above, the second set is represented by equally spaced states inside the defined range, and subsequent models partially cover the rest of the space. The second set has in total 35 experiments: 27 from the first set plus an additional 8 experiments (see Fig. 8).

It is important to point out that these eight additional experiments are in different RDC (radial depth of cut) planes from the ones used in the previous set. Again, they are equally spaced.

    4.3.3. Third set of experiments

This set consists of 12 additional experiments (see Fig. 9), which results in a total of 47 experiments.

    4.3.4. Fourth set of experiments

Eighteen additional points inside the range, as shown in Fig. 10, are considered, resulting in a total of 65 experiments.

In summary, four different experimental sets are defined to be used in the training phase. Each set is used to train each one of the ANN models.

    Fig. 9. Third set of experiments.

    Fig. 10. Fourth set of experiments.

    4.3.5. Validation set

This set (made of 20 new experiments) is used to compare the measured values with the ones predicted by the ANNs. These experiments will also support the determination of the optimum number of representative training data.

All experiments were performed using the above-mentioned milling machine. Forces in the X-, Y- and Z-directions were measured and are found to be periodic.

    4.4. Data pre-processing

After collecting the force components, the resultant force R was calculated using the following equation:

R = √(F_x^2 + F_y^2 + F_z^2)   (8)

The maximum (MAX), minimum (MIN), mean (MEAN) and standard deviation (STDV) values of this resultant force are calculated for each experiment, as they represent important characteristics of a continuous force pattern. Next, the data are normalized in order to make them suitable for the training process [11]. This was done by mapping each term to a value between 0 and 1 using the following formula:


    Fig. 11. General ANN topology.

N = (R − R_min)(N_max − N_min) / (R_max − R_min) + N_min   (9)

where N is the normalized value of the real variable; N_min and N_max are the minimum and maximum values of normalization, respectively; R is the real value of the variable; R_min and R_max are the minimum and maximum values of the real variable, respectively.

This normalized data was utilized as the inputs (machining conditions) and outputs (characteristics of the resultant force) to train the ANN. In other words, two vectors are formed in order to train the neural network (see Fig. 11):

Input = [feed rate; spindle speed; radial depth of cut];
Output = [MAX; MIN; MEAN; STDV];
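The pre-processing pipeline of Eqs. (8) and (9) can be sketched as follows; the helper names and the sample force values are hypothetical, chosen only to illustrate the computation.

```python
import numpy as np

def resultant(Fx, Fy, Fz):
    """Resultant cutting force, Eq. (8)."""
    return np.sqrt(Fx**2 + Fy**2 + Fz**2)

def normalize(R, Rmin, Rmax, Nmin=0.0, Nmax=1.0):
    """Min-max mapping of Eq. (9): real value R -> [Nmin, Nmax]."""
    return (R - Rmin) * (Nmax - Nmin) / (Rmax - Rmin) + Nmin

# Force samples from one (hypothetical) experiment
Fx = np.array([10.0, 12.0, 11.0])
Fy = np.array([5.0, 6.0, 5.5])
Fz = np.array([2.0, 1.0, 1.5])
R = resultant(Fx, Fy, Fz)

# Output vector of one experiment: [MAX; MIN; MEAN; STDV]
features = [R.max(), R.min(), R.mean(), R.std()]
```

Each of the four features would then be normalized with Eq. (9) over its range across all experiments before training.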

Table 1
Linear regression R for the training phase (using back propagation)

Set   W    MAX    MIN    MEAN   STDV
1st   14   0.987  0.981  0.984  0.992
      21   0.995  0.983  0.993  0.995
2nd   14   0.973  0.965  0.974  0.973
      21   0.983  0.97   0.981  0.974
      28   0.984  0.985  0.986  0.975
3rd   14   0.966  0.917  0.974  0.951
      21   0.973  0.922  0.977  0.958
      28   0.973  0.952  0.987  0.959
      35   0.977  0.955  0.988  0.968
4th   14   0.958  0.912  0.974  0.946
      21   0.967  0.924  0.978  0.953
      28   0.968  0.945  0.983  0.953
      35   0.969  0.933  0.987  0.955
      42   0.97   0.951  0.986  0.956
      49   0.97   0.951  0.988  0.955

Table 2
Error values of the BP network, 1st experimental set, topology 3.2.4

MAX     MIN     MEAN    STDV
0.1293  0.0353  0.5958  0.0832
0.4576  0.048   0.1146  0.1396
0.7888  0.088   0.0775  0.2374
0.6513  0.0241  0.2677  0.2231
0.002   0.0002  0.3962  0.0399
0.0837  0.0356  0.2639  0.0494
0.6995  0.2055  0.0747  0.2582
0.5869  0.4021  0.0476  0.164
0.3958  0.2684  0.167   0.1484
0.4416  0.0148  0.4065  0.0945
0.5721  0.0447  0.1487  0.2264
0.1314  0.473   0.1748  0.1092
0.721   0.178   0.0093  0.1913
0.0028  0.0217  0.4942  0.0382
0.5879  1.4483  0.9077  0.109
0.0794  0.0158  0.3518  0.0318
0.4278  0.2741  0.205   0.2454
0.6645  0.0548  0.0371  0.2152
0.4569  0.2025  0.1407  0.1116
0.9767  0.0218  0.1056  0.3863

Table 3
Values to report

       MAXIMUM  MINIMUM  MEAN    STDV
Mean   0.4428   0.1928   0.2493  0.1551
Stdv   0.2853   0.3261   0.2235  0.0924

    5. Results

    5.1. Training results

Each experimental set (except the validation set) is used to train each network. This training is repeated for each topology. The performance is measured by the linear regression (R) of each output. With this analysis it is possible to determine the response of the network with respect to the targets: a value of 1 indicates that the network is perfectly simulating the training set, while 0 means the opposite. For all the cases in this study, the value of R (for all output sets) is shown in Table 1. The case of the RBN showed a perfect fitting pattern (R = 1 for all the cases), as expected, since the goal error factor is set to zero.
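The regression measure R used here can be computed as the linear correlation between network outputs and targets; a minimal sketch, assuming the standard Pearson correlation is what the Toolbox regression analysis reports (`regression_r` is a hypothetical helper):

```python
import numpy as np

def regression_r(targets, outputs):
    """Linear correlation R between network outputs and targets.

    R = 1 means the network reproduces the training targets exactly
    (up to a linear fit); R = 0 means no linear relation at all.
    """
    return np.corrcoef(targets, outputs)[0, 1]

t = np.array([0.1, 0.4, 0.7, 0.9])
print(regression_r(t, t))   # perfect fit
```

In practice one R value is computed per output (MAX, MIN, MEAN, STDV), which is how Table 1 is organized.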

5.2. Validation results of the BP model and RBN model

For each network, the difference between the real value and the predicted value is calculated, producing a matrix of 20 by 4 elements: 20 validation experiments (rows) and 4 output parameters (columns).


Table 4
Results from BP: error (mean and stdv) of (real − predicted) × 10^2, N

Topology       MAXIMUM  MINIMUM  MEAN    STDV
1st Set, 27 Experiments
3.2.4 (w=14)   0.4428   0.1928   0.2493  0.1551  Mean
               0.2853   0.3261   0.2235  0.0924  Stdv
3.3.4 (w=21)   0.4328   0.2161   0.2168  0.1672  Mean
               0.2346   0.3018   0.1636  0.0874  Stdv
2nd Set, 35 Experiments
3.2.4 (w=14)   0.3396   0.13     0.2128  0.1073  Mean
               0.2053   0.3133   0.2173  0.0896  Stdv
3.3.4 (w=21)   0.2726   0.1194   0.1332  0.1082  Mean
               0.1802   0.2981   0.1711  0.0628  Stdv
3.4.4 (w=28)   0.2554   0.133    0.1404  0.1035  Mean
               0.1783   0.2325   0.1497  0.0591  Stdv
3rd Set, 47 Experiments
3.2.4 (w=14)   0.3524   0.1127   0.2078  0.1009  Mean
               0.2144   0.2194   0.1887  0.0844  Stdv
3.3.4 (w=21)   0.2293   0.1128   0.1309  0.09    Mean
               0.2002   0.1855   0.1397  0.0485  Stdv
3.4.4 (w=28)   0.253    0.1215   0.079   0.0843  Mean
               0.1815   0.1132   0.0945  0.0471  Stdv
3.5.4 (w=35)   0.2394   0.1309   0.0906  0.0769  Mean
               0.2018   0.1092   0.1131  0.0478  Stdv
4th Set, 65 Experiments
3.2.4 (w=14)   0.2891   0.0859   0.2262  0.079   Mean
               0.2028   0.212    0.1934  0.0628  Stdv
3.3.4 (w=21)   0.1905   0.0794   0.138   0.0569  Mean
               0.187    0.1981   0.1469  0.0424  Stdv
3.4.4 (w=28)   0.1884   0.0998   0.1255  0.0517  Mean
               0.1742   0.1167   0.1321  0.0429  Stdv
3.5.4 (w=35)   0.1923   0.1307   0.0978  0.0515  Mean
               0.1661   0.1147   0.1098  0.0332  Stdv
3.6.4 (w=42)   0.1935   0.0809   0.0974  0.0551  Mean
               0.1733   0.091    0.0797  0.0394  Stdv
3.7.4 (w=49)   0.1988   0.086    0.0892  0.0533  Mean
               0.1715   0.0893   0.0624  0.0388  Stdv

For each column, the mean and standard deviation are calculated. These two values represent the mean error and standard deviation of each output element, respectively. In this way, a vector of two elements is used to make the comparison. To illustrate the calculations, an example is presented. For the back-propagation network, using the 1st experimental set with topology 3.2.4, the error is calculated as follows:

e_ij = |m_ij − p_ij|,  i = 1…20, j = 1…4   (10)

where i refers to the experiment number and j refers to the jth output of the network; e_ij is the error value of the ith machining condition state for the jth output; m_ij is the measured value of the ith machining condition state for the jth output; and p_ij is the predicted value of the ith machining condition state for the jth output.

The calculated errors are shown in Table 2. From this table the mean and standard deviation are calculated for each column. The reported results are shown in Table 3. This was done for each model and for each topology (in BP) as well as each combination (in the RB network). The results are shown in Table 4 (BP) and Table 5 (RBN).
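The column-wise statistics described above, starting from Eq. (10), can be sketched as follows; `validation_stats` and the random stand-in data are illustrative assumptions.

```python
import numpy as np

def validation_stats(measured, predicted):
    """Mean and standard deviation of |m_ij - p_ij| per output, Eq. (10).

    measured, predicted : arrays of shape (20, 4) --
    20 validation experiments (rows) x 4 output parameters (columns).
    Returns two length-4 vectors: column-wise mean error and stdv.
    """
    e = np.abs(measured - predicted)     # error matrix e_ij, Eq. (10)
    return e.mean(axis=0), e.std(axis=0)

# Stand-in data in place of the real validation measurements
rng = np.random.default_rng(1)
m, p = rng.random((20, 4)), rng.random((20, 4))
mean_err, stdv_err = validation_stats(m, p)
print(mean_err.shape, stdv_err.shape)   # (4,) (4,)
```

The two returned vectors correspond to the Mean and Stdv rows reported in Tables 3 to 5.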

6. Methodology used to compare the two Artificial Neural Networks

The selection of the corresponding best network is carried out in terms of accuracy and efficiency. The latter is measured by selecting a minimum number of training experiments that results in a sufficiently accurate model. It is known that the larger the training set, the more accurate the evolved model. Consequently, a cost function is needed to evaluate the simultaneous influence of training-set size and model accuracy.


Table 5
Results from RBN: error (mean and stdv) of (real − predicted) × 10^2, N

Spread        MAXIMUM  MINIMUM  MEAN    STDV
1st Set, 27 Experiments
0.2  Mean     0.3758   0.3143   0.2954  0.1911
     STDV     0.2893   0.2858   0.1277  0.1071
0.5  Mean     0.5111   0.2909   0.1925  0.1685
     STDV     0.28     0.2207   0.193   0.0819
0.8  Mean     0.7133   0.2084   0.8924  0.1149
     STDV     0.5511   0.2551   0.3299  0.1062
2nd Set, 35 Experiments
0.2  Mean     0.224    0.1147   0.1497  0.09
     STDV     0.1981   0.1621   0.0895  0.0825
0.5  Mean     0.2602   0.2183   0.1813  0.0892
     STDV     0.2129   0.1809   0.1537  0.0793
0.8  Mean     0.2532   0.2048   0.1754  0.0917
     STDV     0.2188   0.1717   0.1452  0.0754
3rd Set, 47 Experiments
0.2  Mean     0.2794   0.0753   0.1379  0.0928
     STDV     0.263    0.0989   0.1046  0.0982
0.5  Mean     0.4704   0.2839   0.2028  0.1197
     STDV     0.4654   0.2399   0.1552  0.1107
0.8  Mean     0.5866   0.2995   0.2051  0.1346
     STDV     0.5228   0.235    0.1761  0.1151
4th Set, 65 Experiments
0.2  Mean     0.2272   0.0337   0.1506  0.0621
     STDV     0.2402   0.0401   0.1068  0.0586
0.5  Mean     2.2885   0.7389   0.3053  0.8367
     STDV     2.5721   0.8868   0.2742  0.9921
0.8  Mean     4.7221   1.4937   0.3416  1.8317
     STDV     5.2183   1.7041   0.2734  2.0805

    6.1. Establishment of the cost function

The cost function (C) is set to relate the following parameters:

1. the number of experiments (NE);
2. the error of prediction in terms of two important variables: the maximum resultant force error (EMAX) and the mean resultant force error (EMEAN).

Therefore, the overall cost function is given by

C = α1 (NE / N) + α2 (EMAX / E) + α3 (EMEAN / E)    (11)

where αi (i = 1, 2, 3) are the weights of the corresponding terms, N is the maximum number of possible experiments (in this case 125, which represents the full factorial condition), and E is the maximum allowed error (set to 30 N). The latter value was selected because it constitutes a relatively small error compared to the magnitude of the forces developed during the milling experiments conducted here.

Eq. (11) shows that the closer NE is to N, the higher the value of C. This is compensated by the fact that the error would then be much smaller than E. On the other hand, with a small NE the cost is reduced by the first term but augmented by the last two, since accuracy would be compromised.

In addition, the equation is set to be unitless in order to provide a fair basis of comparison.

The experimental set that gives the least cost is the one selected for the particular ANN model. The two networks are then compared.
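The cost function of Eq. (11) is simple enough to state directly in code (a sketch, not the authors' implementation; the weights αi are spelled out here as a1, a2, a3, with the default values used later in the paper):

```python
def cost(NE, E_max, E_mean, a1=0.8, a2=0.5, a3=0.5, N=125, E=30.0):
    """Eq. (11): trades the size of the training set (NE experiments out
    of N possible) against the prediction errors E_max and E_mean (in N),
    each normalized by the maximum allowed error E = 30 N.  All three
    terms are dimensionless, so the weighted sum is unitless."""
    return a1 * NE / N + a2 * E_max / E + a3 * E_mean / E
```

With the full factorial (NE = N) and both errors at the allowed maximum, the default weights give C = 0.8 + 0.5 + 0.5 = 1.8; a model trained on few experiments with small errors scores much lower.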

The weights of the parameters in Eq. (11) are selected based on the needs of this study. Previous studies have been criticized for the number of experiments required in the training phase; normally, the use of artificial neural networks requires a large number of experiments for training. For this very reason, the heaviest weight is α1 (the term that weights the number of experiments in the cost function), while α2 and α3 are set to smaller, equal values. The reason for choosing equal values for α2 and α3 is that EMAX and EMEAN are of equal relevance to the cost: the maximum force is important in this study because of its great significance for tool breakage, while the mean force indicates the average force experienced during a cutting cycle, giving an indication of the total power used. Based on the above, the weights are initially selected as α1 = 0.8 and α2 = α3 = 0.5.

Since the effect of EMAX and EMEAN on the cost function is important, a sensitivity analysis is carried out to see how the value of the cost function varies when the weights of these terms (α2, α3) are set to different values. These values are bounded above by α1 = 0.8 (since this factor is the largest weight in Eq. (11)) and below by 0 (which corresponds to zero participation). Therefore, three values are considered for α2 and α3: 0.5, 0.7 and 0.2.

The cost value is then calculated from Eq. (11), using the values of EMAX and EMEAN from Tables 4 and 5 for BP and RBN, respectively. The cost is calculated for all the values of α2 and α3; these two (equal) values are represented below by α. The results are shown in Tables 6 (BP) and 7 (RBN).

Table 6
Cost values (BP)

Cost (α = 0.5)
W      1st set     2nd set     3rd set     4th set
14     1.3263      1.14466667  1.23446667  1.274833
21     1.25546667  0.90033333  0.90113333  0.9635
28     N/A         0.88366667  0.85413333  0.939167
35     N/A         N/A         0.8508      0.8995
42     N/A         N/A         N/A         0.900833
49     N/A         N/A         N/A         0.896

Cost (α = 0.7)
W      1st set     2nd set     3rd set     4th set
14     1.7877      1.51293333  1.60793333  1.618367
21     1.68853333  1.17086667  1.14126667  1.1825
28     N/A         1.14753333  1.07546667  1.148433
35     N/A         N/A         1.0708      1.0929
42     N/A         N/A         N/A         1.094767
49     N/A         N/A         N/A         1.088

Cost (α = 0.2)
W      1st set     2nd set     3rd set     4th set
14     0.6342      0.59226667  0.67426667  0.759533
21     0.60586667  0.49453333  0.54093333  0.635
28     N/A         0.48786667  0.52213333  0.625267
35     N/A         N/A         0.5208      0.6094
42     N/A         N/A         N/A         0.609933
49     N/A         N/A         N/A         0.608
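The sensitivity sweep can be reproduced numerically. The sketch below (not the authors' code) recomputes the 2nd-set RBN case, taking NE = 35 and the spread-0.2 mean errors from Table 5 (0.224 and 0.1497 ×10² N, i.e. 22.4 N for the maximum force and 14.97 N for the mean force); the Greek weight symbol is written out as alpha:

```python
NE, N, E = 35, 125, 30.0     # experiments used, full factorial, max error (N)
E_max, E_mean = 22.4, 14.97  # mean prediction errors from Table 5, in newtons
a1 = 0.8                     # weight on the number-of-experiments term

# Sweep the (equal) error weights alpha = a2 = a3 over the three values
# considered in the sensitivity analysis.
for alpha in (0.5, 0.7, 0.2):
    C = a1 * NE / N + alpha * E_max / E + alpha * E_mean / E
    print(f"alpha = {alpha}: C = {C:.8f}")
# alpha = 0.5: C = 0.84683333
# alpha = 0.7: C = 1.09596667
# alpha = 0.2: C = 0.47313333
```

The printed values coincide with the 2nd-set row of Table 7.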

Table 7
Cost values (RBN); results correspond to spread factor = 0.2

Set    Cost (α = 0.5)   Cost (α = 0.7)   Cost (α = 0.2)
1st    1.29146667       1.73893333       0.62026667
2nd    0.84683333       1.09596667       0.47313333
3rd    0.9963           1.2745           0.579
4th    1.04566667       1.29753333       0.66786667

    7. Discussion of results

For the training phase, Table 1 shows the effectiveness of the selected ANN architecture for BP. All R-values are over 0.9. The table shows that the more neurons in the hidden layer (higher W), the better the representation (higher R). By the same token, increasing the number of experiments reduces the value of R, which is compensated by adding more PEs to the hidden layer. This tendency was expected: a larger training set requires more neurons to establish a good model.

Since all R-values are sufficiently high, it is possible to conclude that any of these W combinations in any set can be used to successfully train the neural network. The same applies to the RBN, where all R-values are 1. Furthermore, this indicates that the DOE methodology can be applied successfully, with good results.

From these results it is possible to state that, based on training performance, the RBN is better than BP.

Table 4 shows that the 4th set produces smaller errors than the 1st set, because the former contains more experiments (more information about the process) than the latter. The table also shows the effect of increasing the number of neurons (PEs) in the hidden layer: the more neurons, the more accurate the model for a given experiment set. The magnitudes of the errors are relatively small compared with the forces developed during milling, which in this study vary between 200 and 1000 N.

The results given by the radial basis network indicate that the smallest errors are reached when the spread value is equal to 0.2, for all sets. For this reason, the cost calculation, and therefore the model comparison, is conducted using this particular value.
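As background for the role of the spread parameter, the following is a minimal sketch of an exact-interpolation Gaussian radial basis network on synthetic 1-D data (illustrative only; a toolbox implementation such as the one cited by the paper may parameterize the spread with different constant factors). One basis unit is centered on every training point and the output weights are solved linearly, which is why the training R-values equal 1:

```python
import math

def phi(r, spread):
    # Gaussian basis function; a smaller spread makes each unit more local.
    return math.exp(-(r * r) / (2.0 * spread * spread))

# Synthetic 1-D training data standing in for the normalized milling inputs.
xs = [i / 8.0 for i in range(9)]
ys = [math.sin(2.0 * math.pi * x) for x in xs]
spread = 0.2

# Design matrix: one Gaussian unit per training point.
Phi = [[phi(abs(x - c), spread) for c in xs] for x in xs]

# Solve Phi @ w = ys by Gaussian elimination with partial pivoting
# (the Gaussian kernel matrix is positive definite for distinct centers).
n = len(xs)
A = [row[:] + [y] for row, y in zip(Phi, ys)]
for k in range(n):
    p = max(range(k, n), key=lambda r: abs(A[r][k]))
    A[k], A[p] = A[p], A[k]
    for r in range(k + 1, n):
        f = A[r][k] / A[k][k]
        for c in range(k, n + 1):
            A[r][c] -= f * A[k][c]
w = [0.0] * n
for k in range(n - 1, -1, -1):
    w[k] = (A[k][n] - sum(A[k][c] * w[c] for c in range(k + 1, n))) / A[k][k]

def predict(x):
    # Network output: weighted sum of the basis-unit activations.
    return sum(wi * phi(abs(x - c), spread) for wi, c in zip(w, xs))
```

Accuracy on unseen points, not on the training set, is what degrades when the spread is poorly chosen, which is why the spread must be selected by verification experiments as done here.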

Tables 6 and 7 show interesting results pertaining to the cost values for BP and RBN, respectively. Each model indicates a different lowest-cost set. For BP, the set with the lowest cost is the third set, with costs of 0.85 and 1.07 for α = 0.5 and 0.7, respectively; both values occur at W = 35. For α = 0.2, the lowest cost corresponds to the second set, at W = 28. This trend is shown in Fig. 12, where cost vs set is plotted.

This tendency is due to the lower weight given to the error terms (α = 0.2) compared with the weight given to the number of experiments (NE). From these results, α = 0.5 can be considered the basis of comparison, since this value does not underestimate the participation of the error terms in the cost function. Therefore, the best model from this network is the one with 47 experiments, with a cost value of 0.8508 and a topology of 3 inputs, 5 neurons in the hidden layer, and 4 outputs.

In the radial basis network, the 2nd set gives the lowest cost regardless of the value of α (see Fig. 13). Again, α = 0.5 is selected for comparison purposes. For this network, the best model is the one with 35 experiments, a cost value of 0.8468, and a spread value of 0.2.


    Fig. 12. Cost vs sets (back-propagation network, using the lowest cost of each experimental set).

    Fig. 13. Cost vs sets (RB network).

Based on these results (BP: cost = 0.8508 with 47 experiments; RBN: cost = 0.8468 with 35 experiments), and the fact that the radial basis network is in this particular case about 3 times faster to train than the back-propagation network, the model that best represents the functional relation between the considered milling parameters is the radial basis network.

The selected network not only gave the lowest cost, but could also be trained with fewer experiments and is much faster to train. This indicates that, for this particular case, the RBN is more efficient than BP.

    8. Conclusions and future work

In this paper, two supervised neural networks are used to successfully estimate the forces developed during the milling process. Design of experiments, and specifically an orthogonal arrangement, is used to select the experiments to perform and to establish the different sets considered in each ANN model. DOE contributed to increasing the efficiency of the system by drastically reducing the amount of experimental data needed for successful training.

Based on the results of this study, it is possible to conclude that, with 5 equally spaced values of each selected milling parameter and an orthogonal arrangement, 35 experiments (out of 125) are enough to train and evolve an accurate ANN model of the end milling process. In back-propagation networks, the use of a single hidden layer was shown to work sufficiently well for the process in consideration. However, the radial basis network is shown to be superior to the back-propagation network in predicting the milling forces, when evaluated in terms of a cost function that combines the cost of experiments with accuracy.
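The experiment-selection scheme summarized above can be sketched as follows. The parameter names and ranges below are placeholders (the paper's actual cutting conditions are not restated here), and the 35-run orthogonal subset itself is not reproduced; the sketch only shows the 5-level, 3-parameter full factorial of 125 candidates from which such a subset is drawn:

```python
from itertools import product

def levels(lo, hi, n=5):
    # n equally spaced values between lo and hi, inclusive.
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

# Placeholder milling parameters (NOT the paper's actual ranges).
speed = levels(1000.0, 3000.0)   # spindle speed, rpm
feed = levels(0.05, 0.25)        # feed per tooth, mm
depth = levels(0.5, 2.5)         # axial depth of cut, mm

# Full factorial candidate pool: 5 x 5 x 5 = 125 experiments, of which
# an orthogonal-array-based subset is actually run and used for training.
candidates = list(product(speed, feed, depth))
```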

In this study, a cost function is defined based on specific needs. This fitness function can be refined in the future to represent the characteristics of the milling process more extensively. In the same way, it is possible to design a more systematic methodology to select the spread factor in the radial basis network; this could increase the accuracy of the model.

    References

[1] H.Y. Feng, N. Su, A mechanistic cutting force model for ball-end milling, Journal of Manufacturing Science and Engineering (November 1998).

[2] D.E. Rumelhart, J.L. McClelland, PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 2 vols., MIT Press, Cambridge, MA, 1986.

[3] Q. Liu, Y. Altintas, On-line monitoring of flank wear in turning with multi-layered feed-forward neural network, International Journal of Machine Tools & Manufacture 39 (1999) 1945–1959.

[4] Y. Liu, C. Wang, Neural network based adaptive control and optimisation in the milling process, International Journal of Advanced Manufacturing Technology 15 (11) (1999) 791–795.

[5] V. Tandon, Closing the gap between CAD/CAM and optimized CNC end milling, MSME Thesis, Purdue School of Engineering & Technology, 2000.

[6] D. Cook, C. Chiu, Combining a radial basis neural network with time series analysis techniques to predict manufacturing process parameters, Applied Artificial Intelligence 9 (6) (1995) 623–631.

[7] P.J. Cheng, S.C. Lin, Using neural networks to predict bending angle of sheet metal formed by laser, International Journal of Machine Tools & Manufacture 40 (1999) 1185–1197.

[8] S. Elanayar, Y.C. Shin, Design and implementation of tool wear monitoring with radial basis function neural networks, in: Proceedings of the 1995 American Control Conference, Part 3 (of 6), 1995, pp. 1722–1726.

[9] A.R. Barron, Neural net approximation, in: Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, 1992, pp. 68–72.

[10] R.C. Eberhart, P. Simpson, R. Dobbins, Computational Intelligence PC Tools, AP Professional, New York, 1996.

[11] H. Demuth, M. Beale, Neural Network Toolbox v3 User's Guide, The MathWorks Inc., USA, 1999.

[12] P.J. Ross, Taguchi Techniques for Quality Engineering, McGraw-Hill, New York, 1988.

[13] R.A. Walsh, McGraw-Hill Machining and Metalworking Handbook, McGraw-Hill, New York, 1994.