
Journal of Intelligent and Robotic Systems 3: 51-66, 1990. © 1990 Kluwer Academic Publishers. Printed in the Netherlands.

    Neural Networks in Robotics: A Survey*

BILL HORNE, M. JAMSHIDI and NADER VADIEE, CAD Laboratory for Systems/Robotics, Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131, U.S.A.

    (Received: 12 December 1988; revised: 5 March 1989)

    Abstract. The purpose of this paper is to provide an overview of the research being done in neural network approaches to robotics, outline the strengths and weaknesses of current approaches, and predict future trends in this area.

    Key words. Neural networks, robotics, nonlinear control.

    1. Introduction

Neural networking has become one of the most popular topics in the scientific community within the past few years, marked by record conference attendances and publications. Many disciplines in science have been affected by this 'mini-revolution', and robotics is no exception. This is illustrated by the fact that eighty-five percent of the papers reviewed in this article have been published within the past three years. The papers reviewed here primarily reflect approaches to robotics which may have near-term application to industrial robotics. Most of this research is not committed to explaining physiological aspects of human motor control. However, there is another large body of publications which is committed to this goal (for example, see Bullock and Grossberg [7]).

This paper is structured as follows. Section 2 provides a brief overview of neural networks, some of the criticisms which arise concerning the role of neural networks in robotics, and a description of the broom-balancing problem as a historical perspective. Section 3 provides a brief outline of robotics. Section 4 summarizes specific research in neural network approaches to robotics. Section 5 provides a summary and critical analysis of the approaches outlined in this paper.

    2. An Overview of Neural Networks

Neural networks are systems which have been derived from models of neurophysiology. In general, they consist of a collection of simple nonlinear computing elements whose inputs and outputs are tied together to form a network. The main advantage of neural networks for robotics is their ability to adaptively learn nonlinear functions whose analytic forms are difficult to derive and whose solutions are hard to compute.

    * This work was supported, in part, by Sandia National Laboratories under contract No. 06-1977, Albuquerque, New Mexico.


    The most dominant forms of neural networks used in robotics are the multi-layer perceptron and the Hopfield network. We will give a brief description below of the mechanics of these networks for those who are not familiar with them. However, many other networks have been proposed for robotics including Competitive and Cooperative nets [4], Reward/Punishment nets [6], etc. Where appropriate, sources for more information on these networks have been referenced.

2.1. THE MULTI-LAYER PERCEPTRON

The basic computational element in a multi-layer perceptron (MLP) is the perceptron [37], shown in Figure 1. A perceptron receives N continuously valued inputs, x_i, i = 1, 2, ..., N, from either an external source (e.g. a sensor) or other perceptrons. Each input is multiplied by a scalar weight; the weighted inputs are summed and added to a bias weight to form the intermediate value y. A nonlinear function is then applied to y to form the output of the perceptron, u.

Some typical nonlinear functions are shown in Figure 2. One of these functions is a hard-limiter. When this function is applied to the weighted sum, the perceptron effectively forms a linear decision region in the input space. If perceptrons are layered as shown in Figure 3, then more complex decision regions can be formed. Multi-layered networks can form arbitrary decision regions, including non-convex and multi-modal regions [17, 24]. We would like the MLP to achieve these decision regions adaptively by example; unfortunately, there is no learning algorithm for MLPs when hard-limiters are used. Instead, a function is required which is monotonically increasing and differentiable. Such a function is the sigmoid, which is also shown in Figure 2. The resulting weight update is formed by taking the gradient of the total squared error (the squared difference between actual output and desired output for a training sample) with respect to the weights and performing a gradient search of the weight space. The resulting learning algorithm is called back-propagation because the errors are propagated backwards through the network. The equations are given by [38]:

w_ij,k(n + 1) = w_ij,k(n) + ρ e_i,k u_i,k (1 − u_i,k) u_i−1,j

Fig. 1. A perceptron.


Fig. 2. Typical non-linearities (the hard-limiter and the sigmoid).

Fig. 3. A multi-layered perceptron.

where

e_i,k = Σ_{m=1..N_i+1} u_i+1,m (1 − u_i+1,m) e_i+1,m w_i+1,k,m

Here e_i,k is the error term for the kth node in the ith layer (for the output layer it is simply the difference between the desired and actual outputs), u_i,k the output of the kth node in the ith layer, w_ij,k the weight connecting the jth node in layer i − 1 to the kth node in layer i, N_i the number of nodes in layer i, and ρ the learning rate.
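As an illustrative sketch, these update equations can be exercised on a toy problem. The layer sizes, learning rate, and XOR training set below are our own choices for illustration, not drawn from any of the surveyed work; the instantaneous (per-sample) gradient described above is used:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

# Two-layer perceptron: 2 inputs -> 8 hidden nodes -> 1 output (sizes illustrative).
W1 = rng.normal(scale=0.5, size=(8, 3))   # last column holds the bias weights
W2 = rng.normal(scale=0.5, size=(1, 9))

def forward(x):
    h = sigmoid(W1 @ np.append(x, 1.0))   # hidden-layer outputs
    u = sigmoid(W2 @ np.append(h, 1.0))   # network output
    return h, u

def train_step(x, d, rho=0.5):
    """One back-propagation update from a single training sample."""
    global W1, W2
    h, u = forward(x)
    g2 = (d - u) * u * (1 - u)             # output-layer term e * u * (1 - u)
    g1 = (W2[:, :8].T @ g2) * h * (1 - h)  # errors propagated backwards
    W2 += rho * np.outer(g2, np.append(h, 1.0))  # w(n+1) = w(n) + rho * grad
    W1 += rho * np.outer(g1, np.append(x, 1.0))
    return float(((d - u) ** 2).sum())

# XOR: a mapping no single perceptron can realize
X = [np.array(p, float) for p in ((0, 0), (0, 1), (1, 0), (1, 1))]
D = [0.0, 1.0, 1.0, 0.0]
errs = [sum(train_step(x, d) for x, d in zip(X, D)) for _ in range(5000)]
print(errs[0], errs[-1])   # total squared error shrinks as training proceeds
```

Note that nothing guarantees convergence; the run above merely illustrates the mechanics of the update.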


When sigmoid non-linearities are used, the outputs of the nodes combine to form a continuous nonlinear mapping. Note here that classification is considered just a special case of a nonlinear mapping. This is the property most useful for robotics, since kinematics (forward and inverse), dynamics, sensing and control can all be described in terms of nonlinear mappings.

There are a number of problems with using MLPs. First, there is no guarantee that an MLP will converge to a local minimum within the weight space, since the gradient used in the weight update is the gradient for a given training sample and may not represent the overall gradient. A technique called batching can be used to compute an error based on many samples, thus giving a more realistic estimate of the gradient. However, in practice we have found that this technique does not provide any significant improvement over using an instantaneous estimate of the gradient. Second, the MLP implements only an approximation to an actual nonlinear mapping. The accuracy of this approximation may be questionable. Third, there is no known method for determining the number of nodes, convergence parameters, etc., for a given problem. Finally, there is no known way to 'preprogram' a priori knowledge into the network. As a result the MLP starts in a completely random state and may require a long training period to converge.

    2.2. THE HOPFIELD NETWORK

The Hopfield network [44] has many variations, the simplest of which is discussed here. Hopfield networks are used for associative memory and combinatorial optimization. The latter property has been the most frequently used for robotics applications. The Hopfield network is a one-layer network with feedback connected in a crossbar topology as shown in Figure 4. Each node uses a hard-limiting nonlinearity (variations use other nonlinearities). The equation of the output is given by:

u(k + 1) = f(Tu(k) + x).

    An energy function for the network can be defined as:

E = −(1/2) u^T T u − u^T x,

Fig. 4. The Hopfield network.


where u is the output vector of the network, x the input vector of the network, T the matrix of interconnection weights, f the nonlinearity (e.g. hard-limiter), and E the energy.
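A minimal sketch of this update and energy function makes the Lyapunov behaviour concrete. We use bipolar ±1 states and asynchronous (one-node-at-a-time) updates, the form covered by the convergence argument; the symmetric random weights with zero diagonal are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def energy(T, u, x):
    return -0.5 * u @ T @ u - u @ x   # E = -1/2 u'Tu - u'x

# Symmetric weights with zero diagonal (the assumption behind the
# Lyapunov convergence proof); all values are illustrative.
n = 16
A = rng.normal(size=(n, n))
T = (A + A.T) / 2
np.fill_diagonal(T, 0.0)
x = rng.normal(size=n)
u = np.where(rng.random(n) < 0.5, 1.0, -1.0)   # random bipolar start state

energies = [energy(T, u, x)]
for _ in range(5):                      # a few asynchronous update sweeps
    for i in rng.permutation(n):
        u[i] = 1.0 if T[i] @ u + x[i] >= 0 else -1.0   # hard-limiter update
        energies.append(energy(T, u, x))

print(energies[0], energies[-1])        # energy never increases along the way
```

Each single-node flip changes E by −(Δu_i)(T_i·u + x_i) ≤ 0, which is why the recorded energies form a non-increasing sequence.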

Conceptually, the output update equation serves to perform a minimization of the energy function.* This property can be used for solving combinatorial optimization problems [44, 52]. An example is the travelling salesman problem, in which a salesman must visit N cities and minimize the distance travelled over the entire tour. The only information given is the distance between pairs of cities. To solve this problem an energy function is defined so that the minimization of this function corresponds to a solution of the problem.

An advantage of the Hopfield network is that there exists a proof of convergence based on Lyapunov analysis [44, 52] and, furthermore, the convergence rate is independent of the number of nodes. However, even though the network is guaranteed to converge, it often converges to local minima which are undesirable. For example, in the travelling salesman problem, solutions are often reached which do not satisfy the constraints of the problem, e.g. the salesman does not visit every city. In the travelling salesman problem the energy function can be defined easily since the solution only needs to meet a limited number of constraints. In more difficult problems with large numbers of constraints, the network seems even more likely to arrive at invalid solutions.
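A sketch of such a travelling-salesman energy function follows; the penalty coefficients A, B and the four-city layout are illustrative choices of ours, not taken from [44, 52]:

```python
import numpy as np

def tsp_energy(V, dist, A=500.0, B=500.0, D=1.0):
    """Hopfield-style TSP energy: V[i, t] = 1 if city i is visited at step t.
    The A and B terms penalize constraint violations (each city visited
    exactly once, one city per time step); the D term is the tour length."""
    n = V.shape[0]
    row_pen = ((V.sum(axis=1) - 1) ** 2).sum()   # each city visited once
    col_pen = ((V.sum(axis=0) - 1) ** 2).sum()   # one city per time step
    length = sum(dist[i, j] * V[i, t] * V[j, (t + 1) % n]
                 for i in range(n) for j in range(n) for t in range(n))
    return A * row_pen + B * col_pen + D * length

cities = np.array([[0, 0], [0, 1], [1, 1], [1, 0]], float)
dist = np.linalg.norm(cities[:, None] - cities[None, :], axis=-1)

valid = np.eye(4)                             # tour 0 -> 1 -> 2 -> 3, a legal tour
invalid = valid.copy(); invalid[3, 3] = 0.0   # city 3 is never visited
print(tsp_energy(valid, dist))                # -> 4.0 (pure tour length)
print(tsp_energy(invalid, dist))              # -> 1002.0 (penalties dominate)
```

A state violating a constraint has a much higher energy because of the penalty terms, yet nothing in the network dynamics prevents it from settling there, which is exactly the failure mode described above.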

    2.3. A HISTORICAL PERSPECTIVE: THE 'BROOM-BALANCING' PROBLEM

The broom-balancing problem (also called the inverted pendulum or cart-pole system) is a classic example of the application of neural networks to control. The broom-balancing problem consists of an inverted pendulum of length L and mass m mounted on a cart of mass M, as shown in Figure 5. The goal of the controller is to keep the pole balanced (i.e. the angle θ = 0) and maintain the cart at its origin (i.e. X = 0) by applying a force, F, in the horizontal direction.

    The equations of motion governing the system are:

θ̈ = (3/(4L)) (g sin θ − Ẍ cos θ)

and

Ẍ = [F + m(L θ̇² sin θ − (3/8) g sin 2θ)] / [M + m(1 − (3/4) cos² θ)].

This is an undamped and inherently unstable fourth-order system whose dynamics are nonlinear and coupled. The dynamics of many control problems, including robotic control, are, in general, nonlinear and coupled. Therefore the broom-balancing problem is often considered a proof of principle of neural networks in robotics.
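A simple Euler-integration sketch of these equations of motion illustrates the instability; the masses, length parameter and step size below are illustrative:

```python
import math

# Illustrative parameters: cart mass M, pole mass m, pole length parameter L
M, m, L, g = 1.0, 0.1, 0.5, 9.81

def step(state, F, dt=0.01):
    """One Euler step of the cart-pole equations of motion quoted above."""
    x, x_dot, th, th_dot = state
    s, c = math.sin(th), math.cos(th)
    x_ddot = (F + m * (L * th_dot ** 2 * s - (3.0 / 8.0) * g * math.sin(2 * th))) \
             / (M + m * (1.0 - 0.75 * c ** 2))
    th_ddot = (3.0 / (4.0 * L)) * (g * s - x_ddot * c)
    return (x + dt * x_dot, x_dot + dt * x_ddot,
            th + dt * th_dot, th_dot + dt * th_ddot)

# With no control force, a small initial tilt grows: the system is unstable.
state = (0.0, 0.0, 0.01, 0.0)
angles = []
for _ in range(200):              # 2 seconds of simulated time
    state = step(state, F=0.0)
    angles.append(abs(state[2]))
print(max(angles))                # the pole swings far from the upright position
```

With F = 0 the pole falls within a couple of simulated seconds, which is what makes the problem a meaningful test of a controller.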

    * Note that although this energy function has a quadratic form, u has a non-linear form (typically non-polynomial) and thus the energy function can have multiple minima.


Fig. 5. The cart-pole system.

Originally, the broom-balancing problem was solved by Widrow and Smith [42, 50, 51]. Not only was this the first application of neural networks to control, but it was one of the first applications of neural networks to any problem. Widrow and Smith used 'bang-bang' control (i.e. the controller can apply an impulsive force F of fixed magnitude in either the positive or negative X direction) with a single adaptive linear element (ADALINE). This device was trained by observing a human teacher operate the system manually. Furthermore, Widrow and Smith physically implemented the system rather than simulating it. More recently, Tolat and Widrow [45] have simulated this system using visual inputs only. There are two issues to consider about this approach: (i) bang-bang control is essentially a classification problem, and (ii) the network learned from a human teacher. Neither of these conditions will, in general, extend to robotic control.
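The flavour of this approach can be sketched with a single adaptive linear element trained by the Widrow-Hoff (LMS) rule to imitate a teacher's bang-bang commands. The 'teacher' here is a hypothetical linear policy of our own invention, standing in for the human operator, and the training states are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical linear teacher policy over a 4-dimensional cart-pole state
# (x, x_dot, theta, theta_dot); these weights are invented for illustration.
w_teacher = np.array([1.0, 1.8, 15.0, 3.0])

def teacher(s):
    return 1.0 if w_teacher @ s >= 0 else -1.0   # bang-bang: +F or -F

w = np.zeros(4)          # ADALINE weights, adapted by the LMS rule
mu = 0.05                # adaptation gain (illustrative)
for _ in range(2000):
    s = rng.normal(size=4)       # observed state samples
    d = teacher(s)               # the teacher's command
    y = w @ s                    # linear output, before the hard limiter
    w += mu * (d - y) * s        # Widrow-Hoff (LMS) update

# The hard-limited ADALINE output now largely agrees with the teacher.
tests = rng.normal(size=(500, 4))
agree = sum(teacher(s) == (1.0 if w @ s >= 0 else -1.0) for s in tests)
print(agree / 500)               # agreement fraction close to 1
```

Because the LMS rule fits only a linear combiner followed by a hard limiter, it can imitate this kind of switching policy but not an arbitrary nonlinear controller, which is the limitation noted above.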

Guez and Selinsky [11, 12] have applied MLPs to this problem as a simulation. In doing so, the control forces applied to the cart are continuous values. One interesting outcome of this research is the observation that the performance of the network was much better than that of the human teacher. This improvement is due to the fact that human control is often inconsistent and that there are physiological limitations in humans which the neural network can overcome.

Barto et al. [6] and Anderson [5] have simulated reinforcement learning for the broom-balancing problem. Like Widrow and Smith, these systems use bang-bang control. Here the state space is quantized using a decoder. A neuron-like element called an Adaptive Search Element (ASE) performs the classification. The reinforcement is aided by an Adaptive Critic Element (ACE). The purpose of the ACE is to provide an element of prediction as to what the reinforcement should be. A signal,


    r, is supplied to the network which reinforces network weights if the pole remains balanced and "punishes" weights which result in the pole falling past some nominal value of 0.

    An important issue is that the work of Widrow, Barto, and colleagues requires quantization of the input space with a decoder and thus resembles table look-up methods. For the broom-balancing problem the level of granularity can be fairly coarse and still yield good performance. However, in general most problems will require finer quantization. A study of Raibert [35] indicates that this approach is, in general, impractical for robotic control. The multi-layer perceptron approach used by Guez and Selinsky incorporates the decoding functionality implicitly in the network and thus avoids this problem. One way of thinking about this approach is that a three-layered perceptron can account for the functionality of a decoder and a single classification element combined.
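The decoder/table look-up idea can be sketched as a coarse quantizer that maps the four cart-pole state variables to one of a small number of 'boxes'. The thresholds below are hypothetical, loosely in the spirit of Barto et al. [6]:

```python
import bisect

# Illustrative "boxes"-style quantizer: each state variable is cut into
# coarse bins, and the bin indices are combined into a single box number
# used as a table look-up address. All thresholds are hypothetical.
X_BINS   = [-0.8, 0.8]                         # cart position (m)
XD_BINS  = [-0.5, 0.5]                         # cart velocity (m/s)
TH_BINS  = [-0.10, -0.017, 0.0, 0.017, 0.10]   # pole angle (rad)
THD_BINS = [-0.87, 0.87]                       # pole angular velocity (rad/s)

def box(state):
    """Map a (x, x_dot, theta, theta_dot) state to one of 3*3*6*3 = 162 boxes."""
    code = 0
    for bins, v in zip((X_BINS, XD_BINS, TH_BINS, THD_BINS), state):
        code = code * (len(bins) + 1) + bisect.bisect(bins, v)
    return code

print(box((0.0, 0.0, 0.0, 0.0)))   # -> 82
```

The table size grows multiplicatively with the number of bins per variable, which is exactly why finer quantization quickly becomes impractical for higher-dimensional robotic state spaces.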

    3. An Overview of Robotics

    Robotics constitutes the study of a finite number of rigid mechanical chains which represent a multi-variable non-linear coupled system. The solution of this problem is difficult because even the simplest desired movement requires sophisticated and computationally intensive mathematics. Major problem areas of robotics include kinematics, dynamics, trajectory planning, sensing, control, computer languages and intelligence.

3.1. KINEMATICS

Kinematics refers to the study of robot joint motions without considering the causes of motion. Two distinct subproblems are distinguished here. The forward kinematics problem involves a non-linear matrix mapping from the joint space of the robot (i.e. a description of the robot in terms of joint angles/positions) to Cartesian space (i.e. Cartesian coordinates of the robot end-effector). The forward kinematic solution computes the Cartesian location of the end-effector given the joint space description. The computation is relatively straightforward but requires several non-linear trigonometric operations and matrix multiplications. To do this, independent coordinate systems are associated with each link of the robot. A point, p_i, expressed with respect to link i can be expressed with respect to the coordinate system of link i − 1 by the equation

    Pi = i - lA iP i ,

    where i-~A i is the Denavit-Hartenberg transformation matrix. The inverse problem is, on the other hand, a somewhat more difficult task. The inverse kinematic solution computes the joint space coordinates given the end-effector location in Cartesian


coordinates. This problem can yield multiple solutions. In general, the inverse kinematic solution is more computationally intensive than the forward kinematic solution.
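A sketch of the forward kinematic computation for a planar two-link arm using Denavit-Hartenberg transforms follows (the link lengths and geometry are illustrative). It also exhibits the multiplicity of inverse solutions, since two distinct joint configurations reach the same Cartesian point:

```python
import numpy as np

def dh(theta, d, a, alpha):
    """Homogeneous Denavit-Hartenberg transform (i-1)A_i."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def forward(thetas, link_len=1.0):
    """Forward kinematics of a planar two-link arm (illustrative geometry)."""
    T = np.eye(4)
    for th in thetas:
        T = T @ dh(th, 0.0, link_len, 0.0)   # chain the link transforms
    return T[:3, 3]                          # Cartesian end-effector position

p1 = forward([np.pi / 2, -np.pi / 2])   # elbow bent one way ...
p2 = forward([0.0, np.pi / 2])          # ... or the other way
print(np.round(p1, 6), np.round(p2, 6)) # both reach the point (1, 1, 0)
```

The two joint configurations are the familiar 'elbow-up'/'elbow-down' pair, a concrete instance of the multiple inverse kinematic solutions mentioned above.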

    3.2. DYNAMICS

Robot arm dynamics formulates the mapping between the joint torques applied to the robot and the joint coordinates, velocities and accelerations. Robot arm dynamics is difficult because it requires incorporating the effects of inertia, coupling between joints (Coriolis and centrifugal), gravity loading, and potentially backlash, gear friction, and the dynamics of the control devices. Solutions such as the Lagrange-Euler (L-E) formulation and the Newton-Euler (N-E) formulation require large numbers of trigonometric and nonlinear functions of the joint coordinates, velocities and accelerations. For example, the Lagrange-Euler equations for a simple two-link manipulator with rotary joints and equal-length links are found to be

τ_1 = [(1/3)m_1 l² + (4/3)m_2 l² + m_2 l² C_2] θ̈_1 + [(1/3)m_2 l² + (1/2)m_2 l² C_2] θ̈_2 − m_2 l² S_2 θ̇_1 θ̇_2 − (1/2)m_2 l² S_2 θ̇_2² + ((1/2)m_1 + m_2) g l C_1 + (1/2)m_2 g l C_12

τ_2 = [(1/3)m_2 l² + (1/2)m_2 l² C_2] θ̈_1 + (1/3)m_2 l² θ̈_2 + (1/2)m_2 l² S_2 θ̇_1² + (1/2)m_2 g l C_12

where τ_i is the torque applied to joint i to drive link i, m_i the mass of the ith link, l the length of each link (assumed equal), g the gravitational constant, C_i = cos(θ_i), S_i = sin(θ_i), and C_12 = cos(θ_1 + θ_2).
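These equations can be evaluated directly. The sketch below computes the joint torques for given joint positions, rates and accelerations; the masses, link length and gravity values are illustrative:

```python
import math

def two_link_torques(th1, th2, th1d, th2d, th1dd, th2dd,
                     m1=1.0, m2=1.0, l=1.0, g=9.81):
    """Joint torques from the two-link Lagrange-Euler equations quoted above
    (equal link lengths, rotary joints; parameter values are illustrative)."""
    C1 = math.cos(th1)
    C2, S2 = math.cos(th2), math.sin(th2)
    C12 = math.cos(th1 + th2)
    # inertia (acceleration-related) coefficients
    d11 = (m1 / 3 + 4 * m2 / 3 + m2 * C2) * l ** 2
    d12 = (m2 / 3 + m2 * C2 / 2) * l ** 2
    d22 = (m2 / 3) * l ** 2
    # Coriolis and centrifugal terms
    h1 = -m2 * l ** 2 * S2 * th1d * th2d - (m2 * l ** 2 * S2 / 2) * th2d ** 2
    h2 = (m2 * l ** 2 * S2 / 2) * th1d ** 2
    # gravity loading
    g1 = (m1 / 2 + m2) * g * l * C1 + (m2 / 2) * g * l * C12
    g2 = (m2 / 2) * g * l * C12
    t1 = d11 * th1dd + d12 * th2dd + h1 + g1
    t2 = d12 * th1dd + d22 * th2dd + h2 + g2
    return t1, t2

# Static case: arm held horizontally with all rates and accelerations zero,
# so only the gravity-loading terms contribute.
print(two_link_torques(0.0, 0.0, 0, 0, 0, 0))
```

As a sanity check, the static torques equal the gravity moments about each joint (for joint 1, g·(m_1·l/2 + m_2·3l/2) with the illustrative values).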

    3.3. TRAJECTORY PLANNING

Trajectory planning is the process of defining a desired path in joint or Cartesian space for the joints or the gripper, respectively. Generally this requires computing a set of polynomial functions to generate a sequence of desired reference points. The problem is further complicated when the robot must move around obstacles or is constrained to a given path.
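A common choice for such polynomial segments is a cubic with zero velocity at both endpoints. A minimal sketch (the boundary conditions and timing are illustrative):

```python
def cubic_traj(q0, qf, tf):
    """Cubic joint trajectory q(t), 0 <= t <= tf, with q(0) = q0, q(tf) = qf
    and zero velocity at both endpoints."""
    a2 = 3 * (qf - q0) / tf ** 2
    a3 = -2 * (qf - q0) / tf ** 3
    def q(t):
        return q0 + a2 * t ** 2 + a3 * t ** 3
    return q

q = cubic_traj(0.0, 1.0, 2.0)
print(q(0.0), q(1.0), q(2.0))   # -> 0.0 0.5 1.0 (midpoint halfway, by symmetry)
```

Sampling q(t) at the controller rate yields the sequence of desired reference points described above; higher-order polynomials are needed when velocity or acceleration must also be matched at via points.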

    3.4. SENSING

The use of external sensing mechanisms allows the robot to interact with its environment in a more efficient manner. Types of sensors include position, velocity, acceleration, range, proximity, force, tactile, and visual. However, the performance of sensing systems is relatively primitive. Most industrial robots use minimal sensory feedback.

    3.5. CONTROL

    Given a dynamical description of a robot the purpose of the control module is to maintain the dynamic response of the manipulator in accordance with some


prespecified desired trajectory. In general, control is difficult because the dynamics of the links in a robot arm are nonlinear and coupled. Control techniques include computed torque control [26] and resolved motion rate control [48]. A major problem is controlling a robot which handles large payloads or works in contact with the environment, since the dynamics of the robot change as force is imposed by the environment. Control techniques to deal with this problem include damping control [49], hybrid position/force control [36], and impedance control [16].

    3.6. TASK PLANNING AND INTELLIGENCE

Finally, other important areas in robotics are task planning and intelligence. It is desirable to give the robot a high-level task to perform and have the robot develop an intelligent plan to perform the task. These problems involve the coordination of multiple sources of data and require problem-solving techniques. Also, more flexible robotic systems will have to be developed which can operate in changing and harsh environments. When robots are inflexible, a great deal of time must be invested in setting up a friendly work space and environment. Robots will have to be equipped with some sort of intelligence to deal with uncertainties in the environment.

    4. Summary of Neural Network Approaches to Robotics

4.1. NEURAL NETWORKS IN KINEMATICS

The inverse kinematics (IK) problem, as discussed before, is a difficult one because it is computationally intensive and has multiple solutions. Neural networks may be able to reduce the computational complexity of the inverse kinematics problem. However, analytic IK solutions yield numerically accurate results, whereas the neural network solution in general may not.

Iberall [18, 19] has used cooperative/competitive neural networks [4] to compute the inverse kinematic solution for fingers in a simulated robot hand during grasping. Guez and Ahmad [10] have investigated inverse kinematic solutions for two- and three-degree-of-freedom manipulators using MLPs. Because neural networks are numerically less accurate than closed-form solutions, Guez and Ahmad suggest that neural networks may be best used to provide a good initial estimate for a manipulator that requires iterative methods for its solution.

    4.2. NEURAL NETWORKS IN DYNAMICS

The nonlinear mapping property of neural networks is ideal for robot dynamics. The basic idea is that the neural network learns the inverse dynamical relationship of the robot directly, which can then be used as an inverse dynamics controller. Miyamoto et al. [30] and Kawato et al. [21, 22] have implemented a novel neural network for inverse dynamics control based on neurophysiology. They basically used a single perceptron


which receives values from multiple fixed non-linearities of the input variables. The authors constrain the system by requiring a priori knowledge of the form of the inverse dynamics formulation (i.e. Lagrange-Euler). This property may be useful in estimating coefficients of the L-E equations of an unmodelled robot. In this sense, the authors' approach resembles system identification techniques. Their approach is inherently parallel and adaptive. The authors have implemented the network on a Hewlett-Packard 9600-300-320 for a PUMA 260 manipulator.

    One issue is how the robot learns the inverse dynamics mapping. Psaltis et al. [33, 34] provide an excellent overview of different learning techniques. One problem they discuss is how to effectively train the neural network on-line without using a random control regime. An interesting proposal involves back-propagating errors through the robot to update the neural net. However, this requires some a priori estimate of the dynamics of the robot and the solution will have errors related to the error in this estimate.

    4.3. NEURAL NETWORKS IN TRAJECTORY PLANNING

Jorgenson [20] has investigated the use of simulated annealing [38] for mobile robot path planning. However, he notes that simulated annealing is too computationally intensive for real-time applications such as robotics. Obstacle avoidance in an unknown environment has been implemented by Tsutsumi et al. [46, 47] using Hopfield nets for multi-joint robots and truss structures. Here, whenever a new obstacle is found, the weights are updated by adding a term to the energy function. The energy function used to describe the network weights is quite complex and may be difficult to implement in practice. Seshadri [39] has investigated the use of Hopfield nets for mobile robot path planning. His formulation resembles the travelling salesman problem in that the purpose of the neural network is to minimize the length of a path to a goal position. Here the system is constrained when obstacles are in the work space. Although Seshadri does not give the energy functions, it seems his approach would be simpler than that of Tsutsumi et al. described above. However, the traditional formulation of the travelling salesman problem is equally effective for this task, and since the weights of the network are fixed, the system cannot deal with unanticipated obstacles.

    Liu et al. [25] have used a MLP for classifying various robotic hand grips based on characteristics of the object to be grasped. This work is an excellent example of how classification properties of neural networks can be useful in robotics.

    Eckmiller [8] has developed a novel neural network called a neural triangular lattice (NTL) for storing and retrieving trajectories.

    4.4. NEURAL NETWORKS IN SENSING

    Although the use of neural networks in vision is quite extensive, a discussion of this research is beyond the scope of this paper. The only other work we have come across


in sensing is that of Pati et al. [32], who investigate the use of neural networks in tactile perception. The basic inversion problem of tactile perception is a deconvolution to recover the surface stress from strain data. Pati et al. use the deconvolution properties of a Hopfield net [27] to perform the mapping.

    4.5. NEURAL NETWORKS IN CONTROL

    The bulk of neural network research for robotics has been done in control. Inverse dynamics control was discussed in Section 4.2. Other approaches are diverse and use many different types of networks.

    Pao and Sobajic [31] and Sobajic et al. [43] implemented a positional control algorithm for a two-degree-of-freedom manipulator using an iterative IK solution. The authors have implemented the system using a MLP running on an IBM PC/AT to control an Intelledex 605T manipulator.

    Albus [1, 2, 3] presents his Cerebellar Model Articulation Controller (CMAC) for general robotic control. This approach can be described as a distributed table-lookup method. Most of his papers are concerned with the properties of CMAC and do not discuss control issues at any great length. Miller [29] has incorporated CMAC into a computed torque controller and has attained impressive simulation results.

Guez et al. [13, 14, 15] simulate Hopfield networks in Model Reference Adaptive Control (MRAC). Here the network adjusts the parameters of the MRAC controller rather than providing a complete input/output mapping of a controller.

    Elsley [9] has simulated an inverse Jacobian controller using MLPs. Elsley claims better performance on long movements than a traditional inverse Jacobian controller typically provides. Shepanski and Macy [40, 41] simulate automobile driving using MLPs. They discuss the relationship between their approach and an expert system approach.

    Kuperstein [23] uses a novel neural network called INFANT (Interacting Networks Functioning on Adaptive Neural Topologies) for positional control using visual inputs. This system is highly motivated by neurological and developmental data. Unfortunately Kuperstein does not give much detail on the pure robotics aspects of the system.

    4.6. NEURAL NETWORKS IN TASK PLANNING AND INTELLIGENCE

    Albus [2] has proposed that intelligent control could be implemented with a hierarchy of CMAC modules. At each successive level of the CMAC hierarchy, a CMAC module would decompose a high level command into a set of lower level commands. A command such as 'Build a widget' could be given to the CMAC hierarchy which would decompose the solution into successively simpler commands at each level until


the lowest level CMAC finally computes the required actuator torques. Other work related to intelligent robotics includes a proposal for controlling multiple robots simultaneously for industrial manufacturing [53].

    5. Summary and Analysis

    5.1. CRITICISMS AND ISSUES CONCERNING NEURAL NETWORKS IN ROBOTICS

    This section focuses on some of the criticisms which face neural network control paradigms. Much of the research discussed throughout this paper addresses these issues directly.

Neural networks are primarily classification networks. Many people feel that neural networks (primarily MLPs) are best applied to control problems where some type of classification is being performed. It is certainly true that some very interesting control problems have been implemented using the classification abilities of MLPs. For example, they can be used in relay or bang-bang control systems. However, as we have discussed, classification in these networks is just one type of non-linear mapping. The most promising property of neural networks is their ability to adaptively learn complex mappings. One superficial advantage of this property is that it allows us to avoid deriving some closed-form analytic function by hand. But more importantly, the system could learn mappings which are mathematically intractable. In addition, the system would be portable, since it adapts to the robot (or environment) to which it is applied.

Neural networks must learn from a teacher. The criticism here is that neural networks must learn to imitate some type of conventional controller. But this is of little use since the conventional controller is already available and the neural network cannot improve upon its performance. However, many networks learn from a controller which has no analytic model, e.g. a human teacher. Perhaps the most powerful property of neural networks is their ability to model the controlled system itself. In robotics, for example, it may be possible to implement a complete inverse dynamics model of the robot which could possibly incorporate the dynamics of the control device, backlash, and gear friction. This model would be computed without the need for analytic modelling.

    Although some work has been done with reinforcement learning [5, 6], it is not clear how to apply these principles to general robotic control.

Speed considerations: simulation vs implementation. Most neural networks cannot run in real time. Although neural networks are inherently parallel, no hardware exists to implement them in parallel. They do not map nicely onto existing digital parallel architectures such as hypercubes, transputers, or the Connection Machine. This is because neural networks involve simple processing but complex intercommunication. In most existing parallel architectures, interprocess communication tends to be the biggest bottleneck. Optical implementations which will run in real time are several years away.


    Throughout the paper we have been careful to describe whether the research we have reviewed is merely a proposal, a simulation, or an implementation.

    On-line vs off-line learning. Related to the above issue is the question of whether a neural network should be taught on-line or off-line. On-line learning is desirable because the network would potentially be able to adjust to changes in the system. However, due to speed considerations, it may be necessary to train the network off-line and use it as a non-adaptive system.

Comparisons of neural networks and conventional systems. Perhaps the most important thing to consider is whether a neural network approach to robotics can do something which a conventional approach cannot. This is especially important considering that a neural network solution may often be numerically inaccurate compared to a conventional solution. Researchers should either approach problems which have no conventional solution or give a comparison with conventional techniques. None of the research efforts outlined in this paper offered such comparisons.

    5.2. PROBLEM AREAS IN NEURAL NETWORK APPROACHES TO ROBOTICS

    Problem areas for research in neural network approaches to robotics can be divided into four categories:

(a) Specifying the configuration of the network. Decisions have to be made regarding the type of neural network to be used, its architecture, topology, number of layers, number of nodes, type of non-linearity, and associated parameters. It is not adequate to simply throw a large number of resources at the problem. For example, in an MLP it might seem possible to use more nodes than are needed for a given problem. However, it has been observed that too many nodes result in the network merely memorizing the training set (over-fitting) and performing poorly on new data.

(b) Specifying the teacher. If supervised learning is adopted, a teacher or expert must be chosen that can supply the correct output to the net. As previously discussed, using a conventional controller as the teacher is useless, since the controller already exists and will outperform the neural net. It is not clear how unsupervised learning techniques can realize complex non-linear mappings.

(c) Specifying the training set. The training set must be specified to give an adequate representation of the type of inputs the network is likely to see. We would like to be able to derive some type of optimal training set for specific application domains. Often random training is used, but this approach is inappropriate for on-line learning.

(d) Specifying the learning algorithm. There is a great need for more efficient learning algorithms to speed training and guarantee convergence. A major problem is that as new training samples are learned, old mappings tend to degrade. This results in the network becoming biased toward recently seen patterns. One solution to this problem is to randomize the order of the training data. However, this can only be effective in off-line learning techniques.


    5.3. SUMMARY

Research in neural network approaches to robotics is quite diverse and rudimentary, leaving much room for improvement and new areas for development. Much of this work yields poor performance relative to conventional techniques, partially because neural networks are not fully understood or developed. For example, it is quite difficult to get an MLP to learn a simple trigonometric function, much less complex dynamical formulations such as the Lagrange-Euler equations.

    Hopefully many of these problems will disappear as we develop and understand neural networks more fully. Much of the preliminary research discussed here is quite promising. For example, the fact that an MLP can implement a robotic kinematics, dynamics, or control mapping at all is quite remarkable, given the convergence properties and relative level of understanding of this network.

    Obviously, the research outlined in this paper should not be taken as the final word in neural network approaches to robotics. Instead, most of this work should be considered as proof-of-principle that neural networks can solve some interesting problems in robotics. In the future, as neural networks are improved and better understood, the solutions to the problems addressed in this paper will be much more competitive with conventional approaches.

    Acknowledgements

    The authors gratefully thank Sandia National Laboratories for supporting this survey paper. We would also like to thank Rebecca Hogenauer for her valuable comments on the original draft of this manuscript.

    The work we have reviewed here is a representative sample of the work being done in neural network approaches to robotics. If we have missed any particular publication, we would appreciate receiving a copy for any further editions of this paper.

    References

    1. Albus, J., A new approach to manipulator control: the Cerebellar Model Articulation Controller (CMAC), J. Dynamic Systems, Measurement, and Control, 220-227 (Sept. 1975).

    2. Albus, J., Data storage in the Cerebellar Model Articulation Controller (CMAC), J. Dynamic Systems, Measurement, and Control, 228-233 (Sept. 1975).

    3. Albus, J., Mechanisms of planning and problem solving in the Brain, Mathematical Biosciences 45, 247-293 (1979).

    4. Amari, S. and Arbib, M., Competition and cooperation in neural nets, In Systems Neuroscience, J. Metzler (ed.), Academic Press, New York, pp. 119-165 (1977).

    5. Anderson, C., Learning to control an inverted pendulum with connectionist networks, IEEE Conf. on Decision and Control, pp. 2294-2298 (1987).

    6. Barto, A. et al., Neuronlike adaptive elements that can solve difficult learning control problems, IEEE Trans. Systems, Man, and Cybernetics, Vol. 13, pp. 834-846 (1983).

    7. Bullock, D. and Grossberg, S., Neural dynamics of planned arm movements: emergent invariants and speed-accuracy properties during trajectory formation, Psychological Review, Vol. 95, No. 1, pp. 49-90 (1988).


    8. Eckmiller, R., Neural network mechanisms for generation and learning of motor programs, IEEE Conf. on Neural Networks, Vol. 4, pp. 545-550 (1987).

    9. Elsley, R., A learning architecture for control based on back-propagation neural networks, IEEE Conf. on Neural Networks, Vol. II, pp. 584-587 (1988).

    10. Guez, A. and Ahmad, Z., Solution to the inverse kinematics problem in robotics by neural networks, IEEE Conf. on Neural Networks, Vol. II, pp. 617-624 (1988).

    11. Guez, A. and Selinsky, J., A trainable neuromorphic controller, J. Robotic Systems 5(4), 363-388 (1988).

    12. Guez, A. and Selinsky, J., A neuromorphic controller with a human teacher, IEEE Conf. on Neural Networks, Vol. II, pp. 595-602 (1988).

    13. Guez, A. et al., Neuromorphic architecture for adaptive robot control: a preliminary analysis, IEEE Conf. on Neural Networks, Vol. 4, pp. 567-572 (1987).

    14. Guez, A. et al., Neuromorphic architectures for fast adaptive robot control, IEEE Conf. on Robotics and Automation, pp. 145-149 (1988).

    15. Guez, A. et al., Neural network architecture for control, IEEE Control Systems Magazine, Vol. 8, No. 2, pp. 22-25 (April 1988).

    16. Hogan, N., Impedance control: an approach to manipulation, J. Dynamic Systems, Measurement, and Control 107, 1-24 (March 1985).

    17. Horne, W. - In an unpublished result we have demonstrated that it is possible to implement a limited class of multi-modal and non-convex decision regions using a two-layered network, thus countering Lippmann's arguments. For more information contact the author.

    18. Iberall, T., A ballpark approach to modelling human prehension, IEEE Conf. on Neural Networks, Vol. 4, pp. 535-544 (1987).

    19. Iberall, T., A neural network for planning hand shapes in human prehension, IEEE Conf. on Decision and Control, pp. 2288-2293 (1987).

    20. Jorgenson, C.C., Neural network representation of sensor graphs in autonomous robot path planning, IEEE Conf. on Neural Networks, Vol. 4, pp. 507-516 (1987).

    21. Kawato, M. et al., A hierarchical model for voluntary movement and its application to robotics, IEEE Conf. on Neural Networks, Vol. 4, pp. 573-582 (1987).

    22. Kawato, M. et al., Hierarchical neural network model for voluntary movement with application to robotics, IEEE Control Systems Magazine, Vol. 8, No. 2, pp. 8-16 (April 1988).

    23. Kuperstein, M., Generalized neural model for adaptive sensory-motor control of single postures, IEEE Conf. on Robotics and Automation, pp. 140-144 (1988).

    24. Lippmann, R., An introduction to computing with neural nets, IEEE ASSP Magazine, pp. 4-22 (April 1987).

    25. Liu, H. et al., Building a generic architecture for robot hand control, IEEE Conf. on Neural Networks, Vol. II, pp. 567-574 (1988).

    26. Markiewicz, B., Analysis of the computed torque drive method and comparison with conventional position servo for a computer-controlled manipulator, Technical Memo 33-601, Jet Propulsion Lab- oratory, Pasadena, CA (1973).

    27. Marrian, C. and Peckerar, M., Electronic neural net algorithm for maximum entropy deconvolution, IEEE Conf. on Neural Networks, Vol. III, pp. 749-758 (1987).

    28. Mel, B., MURPHY: a robot that learns by doing, Proc. AIP Neural Networks Conf., Denver, pp. 544-553 (Nov. 1987).

    29. Miller, W.T. et al., Application of a general learning algorithm to the control of robotic manipulators, Int. J. Robotics Research, 6, No. 2, 84-98 (1987).

    30. Miyamoto, H. et al., Feedback-error-learning neural network for trajectory control of a robotic manipulator, Neural Networks, 1, No. 3, 251-265 (1988).

    31. Pao, Y. and Sobajic, D., Artificial neural-net based intelligent robotics control, Proc. SPIE-Intelligent Robots and Computer Vision, Vol. 848, pp. 542-549 (1987).

    32. Pati, Y. et al., Neural networks for tactile perception, IEEE Conf. on Robotics and Automation, pp. 134-139 (1988).

    33. Psaltis, D. et al., Neural controllers, IEEE Conf. on Neural Networks, Vol. 4, pp. 551-558 (1987).

    34. Psaltis, D. et al., A multilayered neural network controller, IEEE Control Systems Magazine, Vol. 8, No. 2, pp. 17-21 (1988).

    35. Raibert, M., Analytical equations vs table look-up for manipulation: a unifying concept, IEEE Conf. on Decision and Control, pp. 576-579 (1977).


    36. Raibert, M. and Craig, J., Hybrid position/force control of manipulators, J. Dynamic Systems, Measurement, and Control, 102, 126-133 (June 1981).

    37. Rosenblatt, F., Principles of Neurodynamics, Spartan, New York (1962).

    38. Rumelhart, D. et al., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, Cambridge (1986).

    39. Seshadri, V., A neural network architecture for robot path planning, Proc. Second International Symp. on Robotics and Manufacturing: Research, Foundation, and Applications, ASME Press, pp. 249-256 (1988).

    40. Shepanski, J. and Macy, S., Teaching artificial neural systems to drive: manual training techniques for autonomous systems, Proc. SPIE-Intelligent Robots and Computer Vision, Vol. 848, pp. 286-293 (1987).

    41. Shepanski, J. and Macy, S., Teaching artificial neural systems to drive: manual training techniques for autonomous systems, Proc. AIP Neural Networks Conf., Denver, pp. 693-700 (Nov. 1987).

    42. Smith, F., A trainable nonlinear function generator, IEEE Trans. Automatic Control AC-11, No. 2, 212-218 (1966).

    43. Sobajic, D. et al., Intelligent control of the Intelledex 605T robot manipulator, IEEE Int. Conf. on Neural Networks, Vol. II, pp. 633-640 (1988).

    44. Tank, D. and Hopfield, J., 'Neural' computation of decisions in optimization problems, Biological Cybernetics 52, 141-152 (1985).

    45. Tolat, V. and Widrow, B., An adaptive 'broom balancer' with visual inputs, IEEE Int. Conf. on Neural Networks, Vol. II, pp. 641-647 (1988).

    46. Tsutsumi, K. and Matsumoto, H., Neural computation and learning strategy for manipulator position control, IEEE Conf. on Neural Networks, Vol. 4, pp. 525-534 (1987).

    47. Tsutsumi, K. et al., Neural computation for controlling the configuration of 2-dimensional truss structure, IEEE Conf. on Neural Networks, Vol. II, pp. 575-586 (1988).

    48. Whitney, D., Resolved motion rate control of manipulators and human prostheses, IEEE Trans. Man-Machine Systems MMS-10, No. 2, 47-53 (1969).

    49. Whitney, D., Force feedback control of manipulator fine motions, J. Dynamic Systems, Measurement, and Control, pp. 91-97 (June 1977).

    50. Widrow, B. and Smith, F., Pattern recognizing control systems, Computer and Information Sciences (COINS) Symposium Proc., Spartan Books, Wash. DC (1963).

    51. Widrow, B., The original adaptive neural net broom-balancer, IEEE Conf. on Circuits and Systems, pp. 351-357 (1987).

    52. Wilson, G. and Pawley, G., On the stability of the travelling salesman problem of Hopfield and Tank, Biological Cybernetics 58, 63-70 (1988).

    53. Yueng, D. and Bekey, G., Adaptive load balancing between mobile robots through learning in an artificial neural system, IEEE Conf. on Decision and Control, pp. 2299-2304 (1987).