experience inclusion in iterative learning controllers: fuzzy model based approaches

ARTICLE IN PRESS

0952-1976/$ - se

doi:10.1016/j.en

�CorrespondE-mail addr

[email protected]

Engineering Applications of Artificial Intelligence 21 (2008) 578–590

www.elsevier.com/locate/engappai

Brief paper

Experience inclusion in iterative learning controllers:Fuzzy model based approaches

S. Gopinath�, I.N. Kar, R.K.P. Bhatt

Department of Electrical Engineering, IIT Delhi, New Delhi 110 016, India

Received 25 February 2006; received in revised form 4 April 2007; accepted 15 May 2007

Available online 16 July 2007

Abstract

Iterative learning control (ILC) is a new domain in control system that motivates, whether mechanical robots can learn a prescribed

ideal motion by themselves using information represented by the measured data gathered in the previous practice. For each new desired

trajectory task, the conventional ILC methods have to start its learning with zero initial input assumption. Instead of such zero initial

input assumption, in this paper, the idea of using the past trajectory tracking experiences on the initial input selection for tracking new

trajectory-tracking tasks have been highlighted. Certain methods of experience inclusion in iterative learning controllers (ILC) are

proposed. Approximate fuzzy data model (AFDM) and type-1 fuzzy logic system (FLS) techniques have been adopted for the process of

initial input selection. Performance of the proposed fuzzy rule based model-based ILC approaches on initial error reduction and in error

convergence issues are proved. Numerical experimentation on a two-link manipulator model with the inclusion of actuator dynamics

verifies the performance of the proposed fuzzy model-based ILC approaches. Comparison with existing local learning technique on the

selection of initial input for ILC algorithm proves the efficacy of the proposed AFDM and type-1 FLS-based methods.

r 2007 Elsevier Ltd. All rights reserved.

Keywords: Learning control; Intelligence; Robot control; Fuzzy curves; Back propagation

1. Introduction

Iterative learning control (ILC) has become the activeresearch area for the past two decades and great effortshave been put in the development of different learningcontrollers. ILC can be considered for improving thetransient response and tracking performance of processes,machines, equipments or systems that execute the sametrajectory, motion or operation in a repetitive manner(Arimoto, 1996). Industrial robotic operations, chemicalprocesses are such situations where ILC can be used toimprove the performance. The approach is motivated bythe observation that if the system controller is fixed and ifthe system’s operating conditions are the same each time itexecutes, then any errors in the output response will berepeated during each operation. These errors can berecorded during system operation and can be used to

e front matter r 2007 Elsevier Ltd. All rights reserved.

gappai.2007.05.008

ing author. Tel.: +9111 26591093; fax: +91 11 26581606.

esses: [email protected] (S. Gopinath),

n (I.N. Kar), [email protected] (R.K.P. Bhatt).

compute modifications to the input signal that will beapplied to the system during the next operation, or trial. InILC, refinements are made to the input signal after eachtrial until the desired performance level is reached.Fig. 1 illustrates the basic configuration of ILC, where

uk(t) the input signal during the kth iteration applied to thesystem produces the output trajectory yk(t) and yd(t)denotes the desired trajectory, which the system shouldtrack. These signals are stored in the memory until thecurrent (kth) iteration is over, at which time they areprocessed offline by the ILC algorithm. The learningcontroller compares yd(t) and yk(t) and adds an updateterm with uk(t) to produce uk+1(t), the refined input signalgiven to the system for the (k+1)th iteration. The abovesignals/trajectories are functions of time defined on a finiteinterval tA[0,T] and updates occur sequentially in timeupto the required error goal is reached. D-type ILCalgorithm was introduced by Arimoto et al. (1984) , as afirst work on ILC is of the form ukþ1ðtÞ ¼ ukðtÞ þ G_ekðtÞ,where the derivative of the tracking error is defined as_ekðtÞ ¼ _ydðtÞ � _ykðtÞ with a suitable learning gain G. Bondi

www.elsevier.com/locate/engappai

dx.doi.org/10.1016/j.engappai.2007.05.008

mailto:[email protected]



ARTICLE IN PRESS

uk

SYSTEM

LEARNING

CONTROLLER

Memory

yk

yk

yd

uk+1

MemoryMemory

uk

Fig. 1. Basic iterative learning control configuration.

S. Gopinath et al. / Engineering Applications of Artificial Intelligence 21 (2008) 578–590 579

et al. (1988) set the learning controller with a sufficientlinear feedback, which is shown to ensure the trackingperformance. Further research on ILC led to the develop-ment of new algorithms and their implementation issueswith the existing conventional feedback controllers (Atke-son and McIntyre, 1986; Kuc et al., 1991; Jang et al., 1995).The capability of ILC for varying environments andmodeling errors has been explained in Moore (1998). Asan extension to the first-order systems, higher-order ILCfor a class of nonlinear dynamic systems was reported inBien and Huh (1989). It uses more historical data to get thebetter output tracking performance. Stability analysis oflearning control scheme with disturbances and uncertaininitial condition were discussed by Heinzinger et al. (1992).Some of the nonlinear robust, adaptive and model basedILC algorithms are also addressed by Xu and Tan (2003),Tayebi (2004) and Bukkems et al. (2005). Applications ofILC algorithms are applied for the robot including itsactuator dynamics and for direct drive robots are explainedin (Gopinath and Kar, 2004; Bukkems et al., 2005). Thedetailed survey of research progress in ILC area can befound in (Bristow et al., 2006).

The advantage of ILC is that it does not require theknowledge about the system dynamics and learning startsas iteration progress. This results in large initial errors inthe early stage of iterations and very slow convergence oftracking error as the iteration number (k)-N. In theabove ILC algorithms (Arimoto et al., 1984; Bondi et al.,1988; Atkeson and McIntyre, 1986; Kuc et al., 1991; Janget al., 1995; Moore, 1998; Bien and Huh, 1989; Heinzingeret al., 1992; Gopinath and Kar, 2004; Xu and Tan, 2003;Tayebi, 2004; Bukkems et al., 2005; Bristow et al., 2006 ),the controller is designed with the assumption of zerolearning. i.e., the ILC has zero initial input ðu0ðtÞ ¼ 0; 8t 2½0;T �Þ to the system for each new desired trajectorytracking. The past data of old iterations, which will notbe used in future iterations, are discarded in the case ofsingle trajectory-tracking task. If the system has to trackthe multiple desired trajectories in different time spans withdifferent magnitudes for different purposes, then the ILCalgorithm will have relatively high error or poor initialiterations performance and will require more number ofiterations each time to reach the acceptable level of

performance. Preserving the knowledge of the previousdesired trajectory tracking would be more useful in suchsituations. In conventional methods, this knowledge will belost and for each new desired trajectory tracking, ILCalgorithm will start from zero learning. If the knowledge ofthe past desired trajectory tracking could be stored as adatabase of ‘experience’, it can be used for the selection ofthe initial input for the new trajectory-tracking tasks.Direct learning of desired control input for the trajectorieswith different time scales has been explained in Xu (1998).Initial control input selection and its importance over thespeed of error convergence for time domain-based ILCalgorithms have been suggested by Arif et al. (2001). Localweighted learning techniques suggested by Atkeson et al.(1997a), are used as the tool for finding the initial controlinputs from the experience database, which contains theprevious desired output trajectories and their correspond-ing inputs. It is possible to get a better convergence of thesystem’s output to the desired trajectory with, if the initialcontrol input can be selected properly and efficiently fromthe earlier experiences of tracking several trajectories.Thus, it leads from the ILC view point, the significantinitial input selection may lead to the great reduction in theinitial iteration errors and thus the convergence of the errorwith iteration could be significantly improved as comparedto learning controller with zero learning.Fuzzy logic is widely used in intelligent control to reason

about rules, which describe the relationship betweenimprecise, linguistic assessment of the system’s input andoutput states. These production rules are generally naturallanguage representation of expert’s knowledge and providean easily understood knowledge representation scheme.Fuzzy rule-based systems could also be used as a tool formodeling nonlinear systems from the viewpoint of uni-versal approximator of functions (Wang and Mendel, 1992;Kreinovich et al., 1998). In this paper, we are exploiting thefuzzy curves technique reported in (Lin and Cunningham,1995; Azeem et al., 2003), for determining the initialcontrol input of the ILC. We propose a new idea ofpreservation of the trajectory-tracking experience as anapproximate fuzzy data model (AFDM). The key idea ofthe proposed approach is to generate a rule base of IF-THEN fuzzy rules using the experience database of desiredsystem states and their corresponding optimal controlinputs. The initial input selection for the multipletrajectory-tracking tasks by the use of AFDM-defined rulebase technique, are explained in detail in the paper. Theproposed method of experience inclusion has been shownto be better on initial error reduction and convergenceissues in comparison with zero learning. These issues havebeen proved analytically and verified through simulations.A comparative study has also been performed on thelearning controller with experience inclusion using theAFDM and the locally weighted learning methods.This paper has been organized as follows. The problem

statement is presented in Sections 2 and 3 describes theissue of experience inclusion in learning controllers and the

ARTICLE IN PRESSS. Gopinath et al. / Engineering Applications of Artificial Intelligence 21 (2008) 578–590580

use of existing local learning technique. Proposed AFDM-based ILC approach, its rule base formation and initialinput selection issues are explained in Section 4. Thissection also contains the detailed mathematical justifica-tions on initial iteration error reduction and the errorconvergence issues. Section 5 presents the detailed numer-ical experimentation carried on a two-link robot manip-ulator system with the inclusion of its actuator dynamics,thus verifies the performance of the proposed AFDMapproach in comparison with local learning method.

2. Problem statement

Consider a nonlinear time-varying dynamic systemdescribed as

_xkðtÞ ¼ fðxkðtÞ; ukðtÞÞ,

ykðtÞ ¼ gðxkðtÞ; ukðtÞÞ, ð1Þ

where k corresponds to the kth iteration of the system,xk(t)ARnx1, uk(t)ARmx1 and yk(t)ARrx1 are the states,control input and output of the system, respectively, fortA[0,T ] and fð:Þ and gð:Þ are unknown nonlinear functions.Also. the function fð:Þ satisfies the Lipschitz condition

jjfðx1; u1; tÞ � fðx2; u1; tÞjjpLf ½jjx1ðtÞ � x2ðtÞjj

þ jju1 � u2jj�, ð2Þ

where 0oLfoN is an unknown Lipschitz constant. Let thedesired trajectory be yd(t) is continuously differentiable onthe finite time interval tA[0,T], such that for bounded desiredtrajectory yd(t), there exists a unique bounded input ud(t) fortA[0, T], for which system has unique bounded states xd(t)and yd(t) ¼ g(xd(t), ud(t), t) for tA[0,T]. The trajectory-tracking problem can be stated as follows: for a desiredtrajectory yd(t), find a sequence of piecewise continuouscontrol input u(t) such that the system’s output y(t)converges to yd(t). For practical systems, only boundedconvergence is expected. In the context of ILC, this trackingproblem is solved by modifying the control input uk(t) ineach iteration k such that the control input uk(t) converges tothe desired control input ud(t). In search of this desiredcontrol input ud(t), instead of zero initial input assumption aswe do in conventional ILC algorithms, this paper addressesthe issues related to the selection of appropriate initial inputsfor the multiple trajectory tasks. In this paper, the followingnorm definitions are used for a function f(t) are stated as

jjf ðtÞjjs ¼ supt2½0;T �

e�stjjf ðtÞjj, (3)

jjf ðtÞjj1 ¼ supt2½0;T �

jjf ðtÞjj. (4)

3. Experience inclusion techniques in ILC

Incorporation of the experience in ILC algorithms makethe learning controllers more intelligent, which are able to

handle the multiple trajectory tasks with less initialtracking errors and quick convergence with iterations.This section briefs the existing locally weighted learningtechnique for experience inclusion and the initial inputselection issues.

3.1. Locally weighted learning approach-brief overview

Local learning is one of the memory-based techniquesthat, once a query is received, extract a predictioninterpolating the neighboring points of the query locally,which are considered relevant according to some distancemeasure (Azeem et al., 2003; Atkeson et al., 1997b;Bontempi et al., 1999). Thus, local learning methodinvolves storing the training data in memory and findingrelevant data in the database by using kernel functions toanswer a particular query. In the tracking problem of robotsystem, the main objective is to find the desired controlinput such that the robot output y(t) tracks the desiredtrajectory yd(t). The desired control input corresponds tothe desired output yd(t) and the desired states xd(t) can becomputed as

udðtÞ ¼ g�1ðxdðtÞ; ydðtÞÞ ¼ ZðxdðtÞ; ydðtÞÞ. (5)

In the above Eq. (5) , the function Zð:Þ represents thenonlinear complex inverse model of the system. Thisinverse model can be represented by a database ofexperiences, which consists of the information correspond-ing to the desired system states such as position, velocityof the robot links and the corresponding desired torqueinput for robot links. The experience database is repre-sented as [X, U], where X represents the state vector ofposition and velocity information and U represents thecorresponding control input vector. Let us consider aquery vector Q consists of new desired trajectory informa-tion. According to the locally weighted learning, a locallinear model can be formed using the query point withthe use of Euclidean distance of the k-nearest neighbordata points (Seidl and Kriegel, 1998) from the querypoint. Weights are assigned for these data points accordingto their distance from the query point. More weightage forthe nearer points and less for the far away points areassigned. Kernel functions (Atkeson et al., 1997a) areutilized for weighting the k-nearest data points. Initialiteration input for the new desired trajectory can becomputed from the local linear model either by locallyweighted regression (Arif et al., 2001) or by the weightedaverage local models (Atkeson et al., 1997a). In the latermethod the local linear models, weighted average thecontrol signals (uj) of past desired trajectories as stored inthe database. These points are searched from the databaseby the k-nearest neighbor search and given weights by thekernel functions. Let wj be the weights for each k-nearestdata points to the query. Thus for a query vector Q, thepredicted value of the control input vector u0(t) could befound by the following weighted average of the linear local

ARTICLE IN PRESSS. Gopinath et al. / Engineering Applications of Artificial Intelligence 21 (2008) 578–590 581

models as:

u0ðtÞ ¼

Pkj¼1wjujPk

j¼1wj

8t 2 ½0;T �, (6)

where uj represents the k-nearest control input points fromthe database, which corresponds to the query vector Q.This predicted control input (u0(t)) can be used as the initialcontrol input for the learning controllers instead of zerolearning assumption in the case of multiple trajectory tasks.

4. Approximate fuzzy data model (AFDM) approach

This section presents the AFDM methodology used inthe proposed method of experience inclusion. It providessignificant initial input for the ILC, necessary for themultiple trajectory tasks. The trajectory tracking informa-tion such as position, velocity and the correspondingtorque inputs are utilized to derive a rule base for AFDM.Thus, the past trajectory tracking experience are utilized toderive an AFDM consisting of a rule base as an alternativeto local linear models as in locally weighted learningapproach.

4.1. Rule base for AFDM

The proposed fuzzy model-based ILC approach is shownin Fig. 2. In consideration of the tracking problem ofrobot, let the experience data set be (Xj, Uj), where Xj is thestate trajectory vector containing 2n elements such as theposition ðx

j1iÞ, velocity ðx

j2iÞ information and Uj denotes

the corresponding control input vector with n elements.Here, j ¼ 1,2,y,m is an index representing the number ofdesired trajectories experienced by the ILC.

Elements n depends on the total time duration of thedesired trajectory (T) and the sampling interval (Dt) andcan be computed as n ¼ T/Dt. Suppose the learningcontroller has to track a new desired trajectory ðX new

d Þ. It

u0 (t)

Fuzzy model

d

newX

ek

(t)

ILC Learning

Algorithm

Memory

Fuzzifier

Defuzzifier

Fig. 2. Fuzzy model based ite

consists of position ðqiÞ and velocity ð _qiÞ information,which are denoted by a query vector Q ¼ ½q1; q2; . . . ; qn;_q1; _q2; . . . ; _qn�. In the rule base of AFDM, we require todefine each rule for each vector pair (Xj, Uj) of thefollowing form:

Rulej

i : IF x1i is mj

i ðx1i; xj1iÞ

AND x2i is mj

i ðx2i; xj2iÞ THEN ux1i ;x2i

i is uj

i

ðfor i ¼ 1; 2; . . . ; n and j ¼ 1; 2; . . . ;mÞ. ð7Þ

Here ux1i ;x2i

i is the ith input element corresponding to theinput elements (x1i, x2i) and u

ji is a crisp value of input

vector Uj. Also, m ji ðx1i; x

j1iÞ and m j

i ðx2i; xj2iÞ are the Gaussian

membership functions defined by

m ji x1i; x

j1i

� �¼ exp �

x1i � xj1i

b

!20@

1A;

m ji x2i; x

j2i

� �¼ exp �

x2i � xj2i

b

!20@

1A. ð8Þ

From Eq. (8), it is clear that when x1i ¼ xj1i and x2i ¼ x

j2i,

the corresponding membership function values,m j

i ðx1i; xj1iÞ ¼ m j

i ðx2i; xj2iÞ ¼ 1:0.

4.2. Selection of initial control input by AFDM approach

AFDM technique is basically used to evaluate therelative significance of an input candidate. An approximateinverse model of the nonlinear system as given in Eq. (5)can be learnt by the AFDM using the IF-THEN rules.Each time when a new trajectory has to be tracked, anapproximate desired control input ~udðtÞ will be generatedby the AFDM, which then be corrected by the learningcontroller in a way that the control input ukðtÞ converges tothe desired control input udðtÞ as iteration progress. Hence,the fuzzy data model provides more significant andappropriate prediction of control input required to trackthe new desired trajectory. Thus, the convergence of

uk (t) X

k (t)

Plant

IF-THENRule Base

rative learning controller.


learning controller will be improved as compared to that ofzero initial input learning for acceptable bounded errors.The significant initial input u0ðtÞ corresponding to newdesired trajectory task has been computed from thedefuzzifier of the AFDM as shown in Fig. 2. The centroiddefuzzification formula for the output of AFDM definedby rules as in Eq. (7) is given by

u0 ið Þ ¼

Pmj¼1

m ji qi; x

j1i

� �^ m j

i _qi;xj2i

� �� u

qi ; _qi

i

Pmj¼1

m ji qi;x

j1i

� �^ m j

i _qi; xj1i

� �� ; for i ¼ 1; 2; . . . ; n,

(9)

where 4 is the minimum operator. Thus, Eq. (9) provesthat AFDM is a special case of generalized fuzzy model(GFM) (Kreinovich et al., 1998; Lin and Cunningham,1995). The defuzzified output of AFDM is called as fuzzycurve (Lin and Cunningham, 1995), which is used as theinitial input ðu0ðtÞÞ for the ILC algorithm. This methodgives more proper and efficient initial control input signalof the learning controller to the system. This leads to thegreater reduction in initial errors with reduced number ofiterations, which is required to reach acceptable error levelfor tracking new desired trajectories as compared to zerolearning as well as with locally weighted learning-basedILC approaches.

Remarks (1). In AFDM based approach, the total numberof rules is equal to mn, where n represents the number ofsampling points and m represents the number of experiencetrajectories. For each new experience trajectory, a new setof rules will be added into the rule base of AFDM. As thenumber of trajectories in the experience set increases, thenumber of rules will increase significantly. The size ofthe rule base can be reduced by the following ways:

�
use of type-1 FLS approach (Mendel, 2001), � use of clustering methods (Azeem et al., 2003) and
judicious selection of trajectories for experience set and
� defining the trajectory-tracking problem in frequency
domain (Gopinath et al., 2005), so that the numberof points to represent each trajectory is reducedsignificantly.

4.3. Type-1 fuzzy logic system approach

To overcome the difficulty of choosing one rule persample point, a type-1 FLS approach is defined in place theAFDM approach. In this paper, multiple-input–single-output (MISO) FLSs are considered, because a multiple-output system can always be separated into a collection ofsingle-output systems. A MISO–FLS can be seen as afunction f : U � <n ! V � <, where U is the input space,V is the output space. The fuzzifier is a mapping from theobserved crisp input space U � <n to the fuzzy sets definedin U, where a fuzzy set defined in U is characterized by amembership function mF : U ! ½0; 1� and is labeled by a

linguistic term F such as ‘‘small’’, ‘‘medium’’, ‘‘large’’,‘‘very large’’, etc. The fuzzy rule base is a set of linguisticrules in the form of ‘‘IF a set of conditions are satisfied,THEN a set of consequences are inferred’’. The positionand velocity information of the experience trajectories areconsidered as the inputs and the corresponding controlinput signal as the output. Thus for a two-input and single-output type-1 FLS, the fuzzy rule base may consist of thefollowing M rules:

Rulel : IF x1 is Al1 AND x2 is Al

2 THEN u is Bl

l ¼ 1; 2; . . . ;M, ð10Þ

where x1 and x2 are the input variables and u is the outputvariable of the fuzzy system. Al

1; Al2 and Bl are the

linguistic terms characterized by the fuzzy membershipfunctions mAl

1ðx1Þ; mAl

2ðx2Þ and mBl ðuÞ respectively. In type-1

FLS approach, the number of rules (M) depends on thenumber of linguistic terms defined to label the experiencedatabase, instead of number of data points as the case inAFDM. The fuzzy inference engine is a decision-makinglogic, which employs fuzzy rules from the fuzzy rule base todetermine the fuzzy outputs of a type-1 fuzzy systemcorresponding to the fuzzified inputs. The defuzzifierperforms a mapping from the fuzzy sets in the outputspace V to the crisp points in V. The following definition isused for the computation of initial iteration input (u0(t)) forthe ILC corresponding to a query Q.

Definition. The output of a MISO type-1 FLS withsingleton fuzzifier, product t-norm, center of sets defuzzi-fier and Gaussian membership function can be written inthe form:

u q; _qð Þ ¼

PMl¼1

cl mAl1

qð ÞmAl2_qð Þ

� �PMl¼1

mAl1

qð ÞmAl2_qð Þ

� � , (11)

where Q ¼ ½q1; q2; . . . ; qn; _q1; _q2; . . . ; _qn� are the inputs forthe fuzzy system, cl is the centroid of consequent part (apoint in the output space V at which mB l ðuÞ achieves itsmaximum value) and mAl

1ðqÞ:mAl

2ð _qÞ are the Gaussian

membership functions, defined by

mAl1ðqÞ ¼ exp �

1

2

q�ml1

sl1

� �2" #

,

mAl2ð _qÞ ¼ exp �

1

2

_q�ml2

sl2

� �2" #

, ð12Þ

where ml1; ml

2 are the mean and sl1; s

l2 standard deviations

of fuzzy sets of lth rule, are real-valued parameters.

4.4. Designing type-1 FLS using back-propagation method

The Gaussian membership functions defined in Eq. (12),has to be provided with the mean and standard deviation


parameters for each fuzzy set. Certainly, we will be havinga number of antecedent as well as consequent parameters,which has to be provided by us initially from theknowledge extracted from the initial experience database.In the back-propagation method, both antecedent andconsequent parameters are tuned using a steepest descentmethod. The defuzzified output of the FLS can beexpressed in Eq. (11). As mentioned above, with theexperience data set (X, U), where X is the state trajectoryvector contains the position ðx

ðiÞ1 Þ, velocity ðx

ðiÞ2 Þ informa-

tion, Uj denotes the corresponding control input vector.Here i ¼ 1,2,y,n denotes the number of samples intrajectory, control input vectors. For an input–outputexperience database as training pair [X(i), U(i)], we wish todesign the type-1 FLS such that the following errorfunction is minimized:

eðiÞ ¼ 12½uðiÞ �U ðiÞ�2 for i ¼ 1; 2; . . . ; n. (13)

It is evident from Eq (11) that control input vector u(.) iscompletely characterized by the design parameters such ascentroid (cl), mean ðml

kÞ and standard deviation ðslkÞ (for all

the rules l ¼ 1,y,M and k ¼ 1, 2). In the back-propaga-tion method both antecedent and consequent parametersare tuned using steepest descent method. In steepestdescent algorithm the following recursive relations areused to update all the design parameters of type-1 FLS,such that the error function defined in Eq. (13) isminimized. Let the recursive count be, p ¼ 0,1,y,

mlkðpþ 1Þ ¼ ml

kðpÞ � am½uðiÞ �U ðiÞ�½clðpÞ � uðiÞ�

�½xðiÞk �ml

kðpÞ�

ðslkðpÞÞ

2ðmAl

1ðxðiÞ1 Þ:mAl

2ðxðiÞ2 ÞÞ, ð14Þ

clðpþ 1Þ ¼ clðpÞ � ac½uðiÞ �U ðiÞ�

�ðmAl1ðxðiÞ1 Þ:mAl

2ðxðiÞ2 ÞÞ, ð15Þ

slkðpþ 1Þ ¼ sl

kðpÞ � as½uðiÞ �U ðiÞ�½clðpÞ � uðiÞ�

�½xðiÞk �ml

kðpÞ�2

ðslkðpÞÞ

3

" #

�ðmAl1ðxðiÞ1 Þ:mAl

2ðxðiÞ2 ÞÞ. ð16Þ

In the above equations, the initial values such as mlkð0Þ,

clð0Þ and slkð0Þ must be provided. Also the values of am, ac

and as must be chosen with some care.

4.5. Initial iteration error reduction and convergence issues

The convergence of the learning controller is stated bythe following condition (Arimoto et al., 1984; Arif et al.,2001):

jjekjjsprjjek�1jjs for ro1. (17)

To validate the effectiveness of experience inclusion theabove inequality condition can be rewritten as

jjekjjsprkjje0jjs for ro1. (18)

Eq. (18) implies that the error at kth iteration depends onthe parameter r, which controls the convergence rate andinitial error e0(t). The following lemma has been stated forthe selection of r.

Lemma. (Arimoto et al., 1984) For the nonlinear timevarying repetitive uncertain system given in Eq. (1), byusing the D-type ILC law incorporating the condition

r9 I r � CBGk k1o1 for 8t 2 ½0;T � (19)

the final tracking error will converge to zero as iterationk-N. &

The proposed fuzzy model based experience inclusionhas a special interest on the other parameter e0(t). Byselecting the significant initial control input u0(t) using theprevious tracking experiences from IF-THEN rule base offuzzy data model, the initial error e0(t) will be less andreduce the error more effectively as the iteration (k)progress. IF-THEN fuzzy rule base gives an inversemapping between the experience trajectory informationsuch as position and velocity with the correspondingcontrol input information due to their universal approx-imation property (Wang and Mendel, 1992; Kreinovichet al., 1998). The fuzzy model provides an approximation~f ðxÞ of a unknown nonlinear function f(x) uniformly onX � <n such that,

jjf ðxÞ � ~f ðxÞjjpd 8x 2 X , (20)

where d40, is a bound for the approximation error.The function f(x) is described by the trajectory and thecontrol input information dataset. AFDM or type-1 FLSfuzzy models approximate the function f(x) by using theIF-THEN fuzzy rules defined in Eqs. (7) and (10). In thecurrent context, the function to be approximated is givenby Eq. (5) where the function output is the desired controlinput and the domain of function is the specified desiredstate trajectory. Define the approximation error in theinitial control input obtained from the fuzzy data model asin Eqs. (9) and (11) can be stated as

Du0ðtÞ ¼ udðtÞ � u0ðtÞ

¼ ZðxdðtÞ; ydðtÞÞ � ~ZðxdðtÞ; ydðtÞÞ, ð21Þ

where ~Zð:Þ is the approximation of inverse mapping by thefuzzy model. Applying norm for above Eq. (21) we get

jjDu0ðtÞjj ¼ jjudðtÞ � u0ðtÞjj

¼ jjZðxdðtÞ; ydðtÞÞ � ~ZðxdðtÞ; ydðtÞÞjjpd. ð22Þ

Thus, the approximation error of the initial control inputu0(t) obtained from the fuzzy model is bounded. By givingthis initial input u0(t) to the system (1), the output error ofthe system will also be bounded such that

jje0ðtÞjj ¼ jjydðtÞ � y0ðtÞjjp�, (23)

ARTICLE IN PRESS

Fig. 3. 2-dof robot manipulator model.

Table 1

Manipulator link details

Parameter Value

Mass of link-1 3.0Kg

Mass of link-2 1.5Kg

Length of links 0.3m

Gravity (G) 9.8m s�2

Table 2

Actuator specifications

Parameters Values

Resistance R1 ¼ R2 ¼ 1.0OInductance L1 ¼ L2 ¼ 0.01H

Torque constant KT ¼ 1.0NmA�1

Turns ratio (N) 1

S. Gopinath et al. / Engineering Applications of Artificial Intelligence 21 (2008) 578–590584

where e is small constant depending on d (sincejjDu0ðtÞjj ! 0) jje0ðtÞjj ! 0).

Thus Eq. (18) becomes

jjekjjsprk�. (24)

For ro1, the error in kth iteration will decrease to zeroas iterations k-N. A proposition has been stated to showthe contribution of experience inclusion in ILC. The initialiteration error in zero learning of the learning controller isdenoted by eZ

0 ðtÞ and the same from the fuzzy model basedlearning controller by e0(t).

Proposition. For a new desired trajectory yd(t), the initialiteration error of the learning controller with the fuzzy datamodel-based experience inclusion will be less in comparisonwith the learning controller with zero learning, i.e.,jje0ðtÞjj5jje

Z0 ðtÞjj if the condition �ojjydðtÞjj is satisfied,

where e is a small positive constant.

Proof. Learning controller with zero initial input, uZ0 ðtÞ ¼

0; 8t 2 ½0;T � will lead to the zero output yZ0 ðtÞ ¼ 0 of the

system. This will leads to the initial iteration error, whosenorm is defined as

jjeZ0 ðtÞjj ¼ jjydðtÞjj. (25)

The initial iteration error in experience inclusion basedlearning controllers can be found from Eqs. (22) and (23) isrepresented as

jje0ðtÞjj ¼ jjydðtÞ � y0ðtÞjjp�, (26)

where e is the small positive constant representing theapproximation error in the output.

By taking the ratio of the errors in Eqs. (25) and (26) weget

jje0ðtÞjj

jjeZ0 ðtÞjj

¼jjydðtÞ � y0ðtÞjj

jjydðtÞjjp

�

jjydðtÞjj. (27)

In Eq. (27), if the condition �ojjydðtÞjj holds, thenthe ratio of the initial iteration errors of ILC with andwithout experience inclusion can be represented by

jje0ðtÞjjojjeZ0 ðtÞjj. Usually, jjydðtÞjj is not small and hence,

jje0ðtÞjj5jjeZ0 ðtÞjj.

5. Numerical experimentation

This section demonstrates the effectiveness of theexperience inclusion in learning controllers based on theproposed method over local weighted learning and zerolearning-based ILCs. Implementation of fuzzy data model-based experience inclusion in ILC algorithms, thus theinitial error reduction and error convergence issues, for thetrajectory-tracking problem of industrial robot manipula-tor system are discussed in this section.

5.1. Physical plant

In this case study, a trajectory-tracking problem of a2-dof robot manipulator system is considered as shown inthe Fig. 3. Two DC motors are used as the actuators of twojoints. The robot manipulator and the actuator modelparameters are given in the Tables 1 and 2.All the above physical parameters of the robot model are

used for the simulation purpose. The knowledge of theseparameters is not needed for computation of control. Adigital computer simulation is performed, to show theperformance of the proposed fuzzy rule based modelingapproaches (AFDM, type-1 FLS)-based Gradient descentILC algorithms (Gopinath and Kar, 2004) for thetrajectory-tracking problem of the robot manipulator.Using the model parameters, we get the following Inertia,Coriolis/centripetal, Friction (Viscous, Coulomb) matricesand Gravity vector terms and the feedback controller gainterms which ensures stability are given as (Gopinath andKar, 2004)

MðqÞ ¼0:54þ 0:27 cos q2 0:135þ 0:135 cos q2

0:135þ 0:135 cos q2 0:135

" #;


Cðq; _qÞ ¼�0:135 _q2 sin q2 �0:135ð _q1 þ _q2Þ sin q2

0:135 _q1 sin q2 0

" #;

GðqÞ ¼13:1625 cos q1 þ 4:3875 cos ðq1 þ q2Þ

4:3875 cosðq1 þ q2Þ

" #;

F ð _qÞ ¼ diagð 4 4 Þ � ð _qÞ þ 5sgnð _qÞ;

K ¼15 0

0 15

� �; L ¼

5 0

0 5

� �.

Simulations are performed with the sampling intervalDt ¼ 0.01 s and the learning gain (G) is selected as 0.2.For comparing the performance of the learning controllerwithout, with experience inclusion using AFDM approachand with existing locally weighted learning approach, aperformance index I is defined as

root mean square error ðRMSEÞ

¼ I ¼

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1

T

Ze2ðtÞdt

s�

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1

n

Xn

i¼0

e2i

sfor n ¼ T=Dt, ð28Þ

where the tracking error e(t ) is defined as the differencebetween desired and actual angular displacements of therobot links. i.e., eðtÞ ¼ qdðtÞ � qðtÞ.

5.2. Formulation of fuzzy rule base from experience

database

As a first step, the learning controller has been given aset of initial desired trajectories as shown in Fig. 4. Startingwith zero learning assumption, the ILC has performedsufficient number of iterations in such a way that accept-able tracking performance of robot has been obtained.Desired angular displacement and velocity information(Xd) and corresponding desired input information (Ud) fortracking the initial desired trajectory are stored as a

Fig. 4. Initial experience trajectories.

database of initial experience. This initial experiencedatabase is used to form the fuzzy rule base for the fuzzymodel. IF-THEN fuzzy rules of the rule base areformulated as per Eq. (7) in AFDM-based approach orby Eq. (10) in type-1 FLS-based approach. In case ofAFDM-based approach, the total number of fuzzy rules isequals to the number of elements of i nitial experiencetrajectory defined as n ¼ T=Dt, where T is total timeduration of the trajectory. But in type-1 FLS approach, itrequires only n2 number of rules/link (n is the number offuzzy sets used to define the initial experience trajectory).In our simulation studies we choose n ¼ 6, leads to only62 ¼ 36 rules/link to define the experience trajectory.

5.3. Implementation and error reduction issues

To show the essential role of experience inclusion in ILCusing the proposed method, a new desired test trajectoriesT-1 for links 1, 2 are considered is shown in Fig. 5. Insteadof zero initial input assumption, fuzzy model-basedapproaches provide more significant initial input for thelearning controllers. A step-by-step procedure to computethe significant initial input is as follows:

Step (1): A query vector Q has been formed thatcontains the position and velocity information of the newdesired test trajectory T-1.

Step (2): The membership values corresponding to thequery vector elements are calculated with respect to theelements of experience database using Gaussian member-ship functions as defined in Eqs. (8) and (12).

Step (3): IF-THEN fuzzy rules formulated from initialexperience database, are used to find the control inputelements corresponding to query vector Q. The member-ship values obtained from step (2) are used as weights forthe control input elements obtained from fuzzy rules.

Step (4): Using these weighted control input elements asobtained from step (3) the defuzzifier of AFDM computesa significant initial input u0(t) as shown in Fig. 7, using thedefuzzification formulas as in Eqs. (9) and (11).

Fig. 5. Test trajectory T-1.

ARTICLE IN PRESS

Fig. 6. Performance index (I) comparison for T-1.

Fig. 7. Initial input prediction from AFDM for test trajectory T-1.




S. Gopinath et al. / Engineering Applications of Artificial Intelligence 21 (2008) 578–590586

Thus, the past trajectory tracking experience in fuzzyrule base form of fuzzy model is utilized for thecomputation of significant initial input. In search ofdesired input ud(t) corresponding to test trajectory T-1,the learning controller starts learning from significantinitial input, not from zero initial input. Thus, the initialiteration error of the robot on test trajectory T-1 tracking issignificantly reduced in comparison with zero learningapproach. Then, the learning controller is allowed toperform certain learning iterations to obtain an acceptableerror bound. Comparing the performance of the proposedmethod with existing local weighted learning, zero learningILC approaches has been made by the performance index


(I) as defined in Eq. (33). From Fig. 6, the use of pasttrajectory tracking experience in learning controller resultsin drastic reduction of initial errors as well as it improvesthe bounded error convergence. Also, we can observe thatfuzzy model-based ILC approaches performs better incomparison with locally weighted learning-based ILCapproach. Information about the test trajectories T-1 andthe corresponding desired input information is also storedwith existing experience database thus new set of fuzzyrules have been added into the rule base of the fuzzy model.

As a second testing phase for the proposed method, newtest trajectories T-2 have been considered as shown inFig. 8. By following the similar steps as discussed above,for test trajectories T-2, the fuzzy model provides a suitableinitial input for the learning controller in such a way that itlearns the test trajectories T-2 with less initial iteration



error. Fig. 9 shows the tracking performance of thelearning controller for test trajectories T-2. Since theproposed approach provides more significant initial inputthan local weighted learning approach, and thus shows itsbetter performance over the other. The initial inputpredictions for test trajectories T-2 by AFDM approachis shown in Fig. 10. The tracking experience for testtrajectories T-2 is also stored in the experience database.In the third testing phase, the test trajectories T-3 are

considered as shown in Fig. 11. As compared with theexperienced trajectories, the desired test trajectory T-3is totally different in shape as well as in time span. FromFig. 12, we can observe that the learning controller withexperience inclusion by the fuzzy models performs better ininitial error reduction in comparison with local weightedlearning and the controller without experience. Predicted




initial inputs for the manipulator links correspond to thetest trajectories T-3 are shown in Fig. 13.

Similar testing process has been conducted for the testtrajectories T-4 and T-5 as shown in Figs. 14 and 17.Corresponding plots for comparing the performances ofthe learning controller using the proposed fuzzy model-based approaches over the other methods are shown inFigs. 15 and 18 for the test trajectories T-4 and T-5,respectively. Initial inputs obtained by AFDM approachfor test trajectories T-4 and T-5 are also shown in Figs. 16and 19. For each new trajectory the fuzzy model-basedILC, utilizing the past trajectory tracking experience andafter the learning new trajectory information is also storedin the existing experience database thus makes it richer andricher.

From the above simulation studies, we can observe thatthe fuzzy model-based approaches (AFDM, type-1 FLS)



provide more significant and smooth initial control inputsin comparison with locally weighted learning approach. Itis due to smoothing and extrapolation properties of fuzzy-based approaches. The performance index values fordifferent test trajectories at initial iteration for the learningcontroller with experience inclusion using fuzzy rule-basedmodeling (AFDM, type-1 FLS) approaches, local weightedlearning approach and without experience are numericallycompared in Table 3. From all the testing phases, it is clearthat the learning controller with significant initial controlinput, predicted from fuzzy model performs well. Theproposed ILC method not only reduces the initial error butalso improves bounded error convergence with iterations.Also, it is clear that the prediction efficiency of fuzzymodel-based approach is far better than locally weightedlearning approaches. The lack of appropriate experience



ARTICLE IN PRESS

Table 3

Comparison of performance index (I) of ILC at initial iteration

Test

trajectories

number

Without

initial input

With initial input

Local learning

approach

AFDM

approach

Type-1 FLS

approach

T-1 1.1669 0.3488 0.2146 0.1987

T-2 0.9856 0.3554 0.1845 0.1521

T-3 0.6943 0.5257 0.2297 0.2038

T-4 0.6937 0.4197 0.0554 0.0663

T-5 0.6296 0.3727 0.1001 0.0979


S. Gopinath et al. / Engineering Applications of Artificial Intelligence 21 (2008) 578–590 589

may adversely affect the performance of local weightedlearning, while the fuzzy model based approach escapessuch effect due to its extrapolation properties.

6. Conclusion

Methods of experience inclusion using fuzzy modelingapproaches have been proposed for the ILC. Experienceinclusion on initial error reduction and improvement onconvergence for bounded error issues are mathematicallyjustified. Significant initial control inputs are generated bythe fuzzy model approaches than locally weighted learningapproaches. Thus, the proposed method reduces the initialerror significantly and improves the bounded errorconvergence with iterations. Proposed approaches performwell in predicting the initial control inputs from the pasttracking experience, thus could be considered as a betteralternative to the local learning approaches. As a result ofexperience inclusion schemes, it is possible to get betterperformance without modifying the structure of learningcontrollers. Numerical experiments on two-link robotmanipulator model show the efficacy of the proposedfuzzy model-based ILC than other approaches. Further

research on the initialization issues of ILCs could befocused on (i) the use of different fuzzy models liketype-2 FLS, (ii) the use of clustering methods and (iii) bydefining the trajectory-tracking problem in transformeddomains.

References

Arif, M., Ishihara, T., Inooka, H., 2001. Incorporation of experience in

iterative learning controllers using locally weighted learning. Auto-

matica 37, 881–888.

Arimoto, S., 1996. Control Theory of Nonlinear Mechanical Systems—A

Passivity-based and Circuit-theoretic Approach. Clarendon Press,

Oxford.

Arimoto, S., Kawamura, S., Miyazaki, F., 1984. Bettering operation of

robots by learning. Journal of Robotic Systems 1 (2), 123–140.

Atkeson, C.G., McIntyre, J., 1986. Robot trajectory learning through

practice. In: Proceedings of the 1986 IEEE International Conference

on Robotics and Automation, San Francisco, CA, April 1986,

pp. 1737–1742.

Atkeson, C.G., Moore, A.W., Schaal, S., 1997a. Locally weighted

learning. Artificial Intelligence Review 11 (1), 11–73.

Atkeson, C.G., Moore, A.W., Schaal, S., 1997b. Locally weighted learning

for control. Artificial Intelligence Review 11 (1), 75–113.

Azeem, M.F., Hanmandlu, M., Ahmad, N., 2003. Structure identification

of generalized adaptive neuro-fuzzy inference systems. IEEE Transac-

tions on Fuzzy Systems 11 (5), 666–681.

Bien, Z., Huh, K.M., 1989. Higher-order iterative learning control

algorithm. IEE Proceedings, Part D 136 (3), 105–112.

Bondi, P., Casalino, G., Gambaradella, L., 1988. On the iterative learning

control theory for robotic manipulator. IEEE Transactions on

Robotics and Automation 4 (1), 14–22.

Bontempi, G., Birattari, M., Bersini, H., 1999. Lazy learning for local

modeling and control design. International Journal of Control 72 (7),

643–658.

Bristow, D.A., Tharayil, M., Alleyne, A.G., 2006. A survey of

iterative learning control. IEEE Control Systems Magazine 26 (3),

96–114.

Bukkems, B., Kostic, D., de Jager, B., Steinbuch, M., 2005. Learning-

based identification and iterative learning control of direct-drive

robots. IEEE Transactions on Control Systems Technology 13 (4),

537–549.

Gopinath, S., Kar, I.N., 2004. Iterative learning control scheme for

manipulators including actuator dynamics. Mechanism and Machine

Theory 39, 1367–1384.

Gopinath, S., Kar, I.N., Bhatt, R.K.P., 2005. Experience inclusion in

fourier series based iterative learning control for manipulators. In:

Proceedings of IEEE Indicon Conference, pp. 294–297.

Heinzinger, G., Fenwick, D., Paden, B., Miyazaki, F., 1992. Stability of

learning control with disturbances and uncertain initial conditions.

IEEE Transactions of Automatic Control 37 (1).

Jang, T.J., Choi, C.H., Ahn, H.S., 1995. iterative learning control in

feedback systems. Automatica 31 (2), 243–248.

Kreinovich, V., Mouzouris, G.C., Nguyen, H.T., 1998. Fuzzy rule based

modeling as a universal approximation tool. In: Fuzzy Systems—

Modeling and Control. Kluwer Academic Publishers, Massachusetts,

pp. 135–195.

Kuc, T.Y., Nam, K., Lee, J.S., 1991. an iterative learning control of

robotic manipulators. IEEE Transactions on Robotics and Automa-

tion 7, 835–841.

Lin, Y., Cunningham III, G.A., 1995. A new approach to fuzzy-

neural system modeling. IEEE Transactions on Fuzzy Systems 3 (2),

190–198.

Mendel, J.M., 2001. Uncertain Rule-Based Fuzzy Logic Systems:

Introduction and New Directions. Prentice-Hall PTR, New Jersey.


Moore, K.L., 1998. Iterative learning control-an expository overview.

Applied and Computational Controls, Signal Processing and Circuits

1, 425–488.

Seidl, T., Kriegel, H., 1998. Optimal multi-step k-nearest neighbor search.

In: Proceedings of ACM SIGMOD International Conference on

Management of Data, USA.

Tayebi, A., 2004. Adaptive iterative learning control for robot manip-

ulators. Automatica 40, 1195–1203.

Wang, L., Mendel, J.M., 1992. Generating fuzzy rules by learning from

examples. IEEE Transactions on Systems, Man, and Cybernetics 22

(6), 1414–1427.

Xu, J.X., 1998. Direct learning of control efforts for trajectories with different

time scales. IEEE Transactions of Automatic Control 43 (7), 1027–1030.

Xu, J.X., Tan, Y., 2003. Linear and nonlinear iterative learning control.

Series of Lecture Notes in Control and Information Sciences, vol. 291.

Springer, Berlin, Germany.

experience inclusion in iterative learning controllers: fuzzy model based approaches

Documents