
MULTIDIMENSIONAL FUNCTION APPROXIMATION USING NEURAL NETWORKS

Enăchescu Călin

    Petru Maior University of Targu Mures, ROMANIA

    [email protected]

Abstract: When solving a problem with a neural network, a primordial task is establishing the network topology. Generally, neural network topology determination is a complex problem and cannot be easily solved. When the number of trainable layers and processor units is too low, the network is not able to learn the proposed problem. When the number of layers and neurons is too high, the learning process becomes too slow. Learning from examples means being able to infer the functional dependence between the input and output spaces X and Z, given the knowledge of the set of examples T. It means that, after we have learned N examples, when a new input variable x comes in, we need to be able to estimate, according to some criterion that we will specify, a corresponding value of z. From this point of view, learning is equivalent to function approximation.

    Key words: Neural networks, approximation, learning

    I. INTRODUCTION

Since 1986 the most popular neural network model has been the Multi-Layer Perceptron (MLP), and the most popular learning algorithm has been the Back-propagation (BP) method [7]. In spite of the fact that classical MLP networks have many advantageous properties, they also have some disadvantages. The most important disadvantage is the slowness of the training procedure, caused by the high number of trainable layers and the necessity of error back-propagation. The training process could be faster if the number of trainable layers were reduced. That was the motivation for developing neural networks with a single trainable layer.

Radial basis functions were first introduced in the solution of the real multivariate interpolation problem [10]. Broomhead and Lowe (1988) [1] were the first to exploit the use of radial basis functions in designing neural networks. Other major contributions to the theory, design and application of RBF networks include papers by Moody and Darken (1989) [8] and Poggio and Girosi (1990) [9].

II. RBF NEURAL NETWORK TOPOLOGY

RBF is a feed-forward neural network with an input layer (made up of source nodes: sensory units), a single hidden layer and an output layer [2]. The network is designed to perform a nonlinear mapping from the input space to the hidden space, followed by a linear mapping from the hidden space to the output space.

The processor units of the hidden layer are different from the processor units of MLP networks. The activation functions are radial basis functions (for example, Gaussian functions). These functions generally have two parameters: the center and the width [3].

The output layer is composed of processor units that compute a simple linear weighted sum, each unit producing one output. The network has a typical property: the value of the weights between the input layer and the hidden layer is 1.

The architecture of the RBF neural network is presented in Figure 1 [4].

Figure 1: RBF neural network topology (inputs x_1, ..., x_i, ..., x_n; hidden units g_1, ..., g_i, ..., g_K; output weights w_1, ..., w_i, ..., w_K).

If $\mathbf{x} = (x_1, \ldots, x_n)$ is the input vector, $g(\cdot)$ is the radial basis function and $\mathbf{c}_i$ is the center parameter of the function corresponding to neuron $i$, then the output created by the network will be:

$$y(\mathbf{x}) = \sum_{i=1}^{K} w_i \, g_i(\mathbf{x}) = \sum_{i=1}^{K} w_i \, g(\|\mathbf{x} - \mathbf{c}_i\|) \qquad (1)$$

Generally, a Gaussian function is used:

$$g_i(\mathbf{x}) = \exp\left(-\frac{\|\mathbf{x} - \mathbf{c}_i\|^2}{2\sigma_i^2}\right) \qquad (2)$$

where $\sigma_i$ is the scale (width) parameter of the function corresponding to neuron $i$.
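For illustration, the following is a minimal NumPy sketch of the network output defined by equations (1) and (2); the function name, array shapes and the use of NumPy are assumptions made for this example, not part of the paper.

```python
import numpy as np

def rbf_output(x, centers, widths, weights):
    """Network output y(x) from equations (1) and (2).

    x       : input vector, shape (n,)
    centers : c_i, shape (K, n)
    widths  : sigma_i, shape (K,)
    weights : w_i, shape (K,)
    """
    # Squared Euclidean distances ||x - c_i||^2 to every hidden unit.
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    # Gaussian activations g_i(x) = exp(-||x - c_i||^2 / (2 * sigma_i^2)).
    g = np.exp(-sq_dist / (2.0 * widths ** 2))
    # Linear output layer: y(x) = sum_i w_i * g_i(x).
    return float(np.dot(weights, g))
```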

There are several methods to select the parameters $(\mathbf{c}_i, \sigma_i)$ of the activation functions. If few training points are present, then all of them can be used as center parameters; in this case the number of processor units in the hidden layer is equal to the number of training points. If the number of training points is high, then not all of them can be used. In this situation, a single neuron can be assigned to a group of similar training points. These groups of similar training points can be identified using clustering methods [5].

III. LEARNING STRATEGIES FOR RBF NEURAL NETWORKS

The hidden layer of an RBF neural network may be trained with a supervised learning algorithm; a descending gradient-based algorithm can be considered. The aim is to establish the synaptic weights $w_i$, $i = 1, 2, \ldots, K$, of the network.

Let

$$T = \left\{ (\mathbf{x}_i, z_i) \;\middle|\; \mathbf{x}_i \in \mathbb{R}^n,\; z_i \in \mathbb{R},\; i = 1, 2, \ldots, N \right\} \qquad (3)$$

be the set of training samples.

A clustering algorithm is applied to the points of the set $T$. The cluster centers $\mathbf{c}_i$, $i = 1, \ldots, K$ are taken as the centers of the radial basis functions (in this way the number of neurons in the hidden layer is $K$). The parameters $\sigma_i \in \mathbb{R}$, $i = 1, \ldots, K$ can be determined from the diameters of the clusters. This step is not executed when $K$ is equal to $N$ ($K = N$), because in this case $\mathbf{c}_i = \mathbf{x}_i$, $i = 1, \ldots, N$ (every training point is also a cluster center, and the value of the width parameters is $\sigma_i = 1/N$).
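A rough sketch of this center and width selection, using a plain k-means loop; the initialization, the iteration count and the fallback width for degenerate clusters are assumptions made here, not prescribed by the paper.

```python
import numpy as np

def select_centers_and_widths(X, K, n_iter=100, rng=None):
    """Choose centers c_i by k-means and widths sigma_i from the cluster spread.

    X : training inputs, shape (N, n).  Returns (centers, widths).
    """
    rng = np.random.default_rng(rng)
    N = X.shape[0]
    if K == N:
        # Every training point is its own center; widths fixed to 1/N (see text).
        return X.astype(float).copy(), np.full(N, 1.0 / N)

    # Plain k-means: start from K randomly chosen training points.
    centers = X[rng.choice(N, size=K, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        # Move each center to the mean of its cluster.
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)

    # Width of each unit taken from the spread (diameter) of its cluster.
    widths = np.empty(K)
    for k in range(K):
        members = X[labels == k]
        d = np.linalg.norm(members - centers[k], axis=1)
        widths[k] = d.max() if len(members) > 1 and d.max() > 0 else 1.0
    return centers, widths
```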

If a Gaussian function is used as activation function, then at the $l$-th step the global learning error is

$$E_l = \frac{1}{N} \sum_{i=1}^{N} (z_i - y_i)^2 \qquad (4)$$

where

$$y_i = \sum_{j=1}^{K} w_j \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{c}_j\|^2}{2\sigma_j^2}\right), \qquad i = 1, \ldots, N \qquad (5)$$

Let us denote:

$$\Delta w_i = -\eta \frac{\partial E}{\partial w_i}, \qquad i = 1, \ldots, K \qquad (6)$$

where $\eta$ is the learning rate and $E$ is the global learning error.

The weights are updated according to the following correction rule:

$$w_i \leftarrow w_i + \Delta w_i, \qquad i = 1, \ldots, K \qquad (7)$$
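A sketch of the resulting gradient-descent training of the output weights, corresponding to equations (4)-(7); the zero initialization, the learning rate and the epoch count are assumptions.

```python
import numpy as np

def train_weights(X, z, centers, widths, lr=0.05, epochs=100):
    """Gradient-descent training of the output weights, equations (4)-(7).

    X : inputs (N, n), z : targets (N,).  Only the w_i are trainable;
    the centers and widths stay fixed, as in the strategy described above.
    """
    N = X.shape[0]
    # Hidden-layer activations g_j(x_i), shape (N, K); they never change.
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    G = np.exp(-sq_dist / (2.0 * widths ** 2))

    w = np.zeros(centers.shape[0])
    for _ in range(epochs):
        y = G @ w                               # y_i, equation (5)
        grad = -(2.0 / N) * (G.T @ (z - y))     # dE_l / dw_j from equation (4)
        w += -lr * grad                         # equations (6) and (7)
    learning_error = np.mean((z - G @ w) ** 2)  # final global learning error (4)
    return w, learning_error
```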

When the learning process is finished, $M$ points which are not in the training set $T$ are randomly generated. The corresponding generalization error is defined by the expression [5]:


$$E_g = \frac{1}{M} \sum_{i=1}^{M} (z_i - y_i)^2 \qquad (8)$$
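A corresponding evaluation sketch for equation (8), reusing the `rbf_output` sketch from Section II; `target_f` stands in for the (in practice unknown) function being approximated, and the sampling range of the $M$ test points is an assumption.

```python
import numpy as np

def generalization_error(target_f, centers, widths, w, M=400,
                         low=-1.0, high=1.0, rng=None):
    """Generalization error (8) on M randomly generated points outside T."""
    rng = np.random.default_rng(rng)
    X_test = rng.uniform(low, high, size=(M, centers.shape[1]))
    z_test = np.array([target_f(*x) for x in X_test])      # true values z_i
    y_test = np.array([rbf_output(x, centers, widths, w)   # network outputs y_i
                       for x in X_test])
    return np.mean((z_test - y_test) ** 2)
```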

    IV. APPROXIMATION AND INTERPOLATION WITH RBF NEURAL NETWORKS

The interpolation problem, in its strict sense, may be stated as follows:

Given a set of $N$ different points $\{\mathbf{x}_i \in \mathbb{R}^p \mid i = 1, \ldots, N\}$ and a corresponding set of $N$ real numbers $\{d_i \in \mathbb{R} \mid i = 1, \ldots, N\}$, find a function $F: \mathbb{R}^p \to \mathbb{R}$ that satisfies the interpolation condition [2], [6]:

$$F(\mathbf{x}_i) = d_i, \qquad i = 1, \ldots, N \qquad (9)$$

The RBF technique consists of choosing a function $F$ that has the following form [10]:

$$F(\mathbf{x}) = \sum_{i=1}^{N} w_i \, g(\|\mathbf{x} - \mathbf{x}_i\|) \qquad (10)$$

where

$$\{ g(\|\mathbf{x} - \mathbf{x}_i\|) \mid i = 1, \ldots, N \} \qquad (11)$$

is a set of $N$ arbitrary radial basis functions. The known data points $\mathbf{x}_i \in \mathbb{R}^p$, $i = 1, \ldots, N$ are taken to be the centers of the radial basis functions.

An RBF network is considered, with a single processor unit in the output layer and $N$ processor units in the hidden layer, where $\{ g(\|\mathbf{x} - \mathbf{x}_i\|) \mid i = 1, \ldots, N \}$ is the set of activation functions of the hidden processor units. The interpolation problem is thus reduced to the determination of the weights (the learning process) [3].
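When every data point is taken as a center, the interpolation condition (9) applied to the form (10) becomes the $N \times N$ linear system $G\mathbf{w} = \mathbf{d}$ with $G_{ij} = g(\|\mathbf{x}_i - \mathbf{x}_j\|)$. A minimal sketch follows; the shared Gaussian width is an arbitrary choice made here for illustration.

```python
import numpy as np

def interpolation_weights(X, d, sigma=1.0):
    """Solve the exact interpolation problem (9)-(10): G w = d.

    X : data points (N, p), d : target values (N,), sigma : shared Gaussian width.
    """
    # Interpolation matrix G_ij = g(||x_i - x_j||) with Gaussian g.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    G = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # The weights are the solution of the N x N linear system.
    return np.linalg.solve(G, d)
```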

In an overall fashion, the network represents a map from the $p$-dimensional input space to the one-dimensional output space, written as:

$$s: \mathbb{R}^p \to \mathbb{R} \qquad (12)$$

The map $s$ can be considered as a hypersurface $\Gamma \subset \mathbb{R}^{p+1}$. The surface $\Gamma$ is a multidimensional plot of the output as a function of the input.

In a practical situation, the surface $\Gamma$ is unknown and the training data are usually affected by noise. Accordingly, the training phase and the generalization phase of the learning process may be viewed as follows [1], [4]:

- The training phase constitutes the optimization of a fitting procedure for the surface $\Gamma$, based on known data points presented to the network in the form of input-output examples.

- The generalization phase is synonymous with interpolation between the data points, with the interpolation being performed along the constrained surface generated by the fitting procedure as the optimal approximation to the true surface $\Gamma$.


V. NUMERICAL EXPERIMENTS

In this section some experiments and the obtained results are presented. Standard interpolation problems are considered, and RBF neural networks are used for approximating functions. The generalized k-means clustering algorithm is used for data clustering and some comparisons are presented [3].

Experiment: In order to study the properties of the RBF networks obtained as a theoretical result, we have implemented this type of neural network and studied its learning and generalization capabilities. We have taken into consideration as target function, to be approximated, the following function:

$$f: \mathbb{R}^2 \to \mathbb{R}, \qquad f(x, y) = 10 \cos(x) \sin(y) \qquad (13)$$

Fig. 2: 400 training data, 10 learning epochs.   Fig. 3: 400 training data, 50 learning epochs.

Fig. 4: 400 training data, 100 learning epochs.   Fig. 5: 400 training data, 500 learning epochs.
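A sketch of how the pieces above could be combined to reproduce this experiment, reusing the earlier sketches; the sampling domain, the number of hidden units and the remaining hyperparameters are not stated in the paper and are assumed here.

```python
import numpy as np

# Target function (13): f(x, y) = 10 cos(x) sin(y).
def f(x, y):
    return 10.0 * np.cos(x) * np.sin(y)

rng = np.random.default_rng(0)
# 400 training points; the sampling domain [-pi, pi]^2 is an assumption.
X_train = rng.uniform(-np.pi, np.pi, size=(400, 2))
z_train = f(X_train[:, 0], X_train[:, 1])

# Select centers and widths by clustering, then train the output weights.
centers, widths = select_centers_and_widths(X_train, K=40, rng=0)
w, e_learn = train_weights(X_train, z_train, centers, widths, lr=0.05, epochs=500)
e_gen = generalization_error(f, centers, widths, w,
                             M=400, low=-np.pi, high=np.pi, rng=1)
print(e_learn, e_gen)
```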


400 training data

    Number of epochs    Learning error    Generalization error
            10            0.1039422           0.1048457
            50            0.0223206           0.0214215
           100            0.0127233           0.0127233
           500            0.0022711           0.0020923
          1000            0.0016508           0.0016481

Table 1: Results of the simulations, describing the number of epochs, the learning error and the generalization error.

    VI. CONCLUSIONS

The experiments described in this paper demonstrate that RBF neural networks can be successfully used for multidimensional function approximation.

REFERENCES

[1] Broomhead D.S., Lowe D. (1988), Multivariable functional interpolation and adaptive networks, Complex Systems 2, 321-355.

[2] Enăchescu, C. (1995), Properties of Neural Networks Learning, 5th International Symposium on Automatic Control and Computer Science, SACCS'95, Vol. 2, 273-278, Technical University "Gh. Asachi" of Iasi, Romania.

[3] Enăchescu, C. (1996), Neural Networks as approximation methods, International Conference on Approximation and Optimisation Methods, ICAOR'96, "Babes-Bolyai" University, Vol. 2, 83-92, Cluj-Napoca.

[4] Enăchescu, C. (1995), Learning the Neural Networks from the Approximation Theory Perspective, Intelligent Computer Communication ICC'95 Proceedings, 184-187, Technical University of Cluj-Napoca, Romania.

[5] Enăchescu, C. (1998), The Theoretical Fundamentals of Neural Computing, Casa Cărții de Știință, Cluj-Napoca (in Romanian).

[6] Girosi, F., Poggio, T. (1990), Networks and the Best Approximation Property, Biological Cybernetics, 63, 169-176.

[7] Haykin, S. (1994), Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, New York, NY.

[8] Moody J., Darken C. (1989), Fast learning in networks of locally tuned processing units, Neural Computation, 1, 281-294.

[9] Poggio T., Girosi F. (1990), Networks for approximation and learning, Proceedings of the IEEE 78, 1481-1497.

[10] Powell M.J.D. (1988), Radial basis function approximations to polynomials, Numerical Analysis 1987 Proceedings, 233-241.