CHAPTER - 3
Teaching Learning Based Optimization Algorithm
3.1 Introduction
Optimization refers to finding one or more feasible solutions that correspond to extreme values of one or more objectives. The need to seek such optimal solutions arises mostly from the practical requirement to design a solution for the minimum possible value of some quantity, the maximum possible value of another, or similar extremes. Because of these extreme characteristics of optimal solutions, optimization processes receive considerable attention in practice. When an optimization problem involves only one objective function, the task of finding the optimal solution is called single-objective optimization. When an optimization problem involves more than one objective function, the task of finding one or more optimal solutions is called multi-objective optimization. Most real-world problems naturally involve multiple objectives.
Analytical and numerical methods have long been applied in engineering computation to find the extreme values of a function. These methods perform well in many practical cases, but they fail on large-scale problems, especially those with non-linear objective functions. In real-world problems the number of parameters can be very large, and their influence on the objective function can be complicated and highly nonlinear. The objective function may have many local minima or maxima, whereas the researcher is always interested in the global optimum within the search space. Classical methods (e.g. gradient methods) cannot handle such problems
because they converge to local optima. In such complex cases, advanced optimization algorithms offer a way out because they find a solution near the global optimum within reasonable time and computational effort. These techniques are comparatively new and are gaining popularity due to properties that deterministic algorithms do not have. Some of the well-known meta-heuristics developed during the last three decades are: Genetic Algorithm (GA), Differential Evolution (DE), Particle Swarm Optimization (PSO), Harmony Search (HS), Simulated Annealing (SA), Artificial Bee Colony (ABC), Ant Colony Optimization (ACO), Shuffled Frog Leaping (SFL), Biogeography-Based Optimization (BBO) and Gravitational Search Algorithm (GSA). These algorithms have been applied to many engineering optimization problems and have proved effective for specific kinds of problems.
The main limitation of all the above algorithms is that they require algorithm-specific parameters for proper working. Proper selection of these parameters is essential for all of them to find the optimum solution; a minor change in the parameter values changes the effectiveness of the algorithm. Moreover, the difficulty of parameter selection often increases with modifications and hybridizations. Therefore, efforts must continue to develop an optimization technique that is free from algorithm-specific parameters, i.e. one that requires no such parameters for its working.
An optimization method, Modified Teaching-Learning-Based Optimization (MTLBO), is proposed in this thesis to obtain global solutions for continuous non-linear functions with less computational effort and high consistency. The TLBO method works on the philosophy of teaching and learning: it is based on the influence of a teacher on the output of the learners in a class, where the output of the class is considered in terms of results or grades. The teacher is normally considered a highly learned person who shares his or her knowledge with the learners, so the capability of the teacher affects the outcome of the learners. Obviously, a good teacher trains learners so that they achieve better results in terms of marks or grades. Learners also learn from interaction among themselves.
The rest of the chapter is organized as follows. The Teaching Learning Based Optimization algorithm is discussed in Section 3.2. The proposed Modified Teaching Learning Based Optimization (MTLBO) algorithm is described in Section 3.3. The Multi-Objective Teaching Learning Based Optimization algorithm is explained in Section 3.4. Finally, conclusions are drawn in Section 3.5.
3.2 Teaching Learning Based Optimization Algorithm
The Teaching Learning Based Optimization (TLBO) algorithm is an efficient population-based algorithm developed by Rao et al. [71, 72]. The algorithm mimics the teaching-learning interaction between a teacher and the learners in a classroom. In this method, a group of students (learners) in a class is considered the population, and the design variables are the subjects (courses) offered to the learners. A learner's result is analogous to the fitness value, and the value of the objective function represents the knowledge of a particular learner. As the teacher is considered the most learned person in the society, the best solution found so far is analogous to the teacher in TLBO. The process of TLBO is divided into two parts: the 'Teacher Phase', which means learning from the teacher, and the 'Learner Phase', which means learning through interaction between learners. The sub-sections below briefly discuss the implementation of TLBO.
3.2.1 Initialization
The population X is randomly initialized within the bounded search space as a matrix of N rows and D columns. The value N represents the number of learners in the class, i.e. the "class size" or "population size". The value D represents the number of "subjects or courses offered" to the learners, which is the same as the dimensionality of the problem. The procedure, being iterative, is set to run for a maximum of G iterations (generations).
The $j$-th parameter of the $i$-th learner (vector) in the initial generation is assigned a random value using

$$x^{(1)}_{(i,j)} = x^{\min}_j + rand \times \left( x^{\max}_j - x^{\min}_j \right) \qquad (3.1)$$

where $rand$ represents a uniformly distributed random variable within the range $(0, 1)$, and $x^{\min}_j$ and $x^{\max}_j$ represent the minimum and maximum values of the $j$-th parameter. The parameters of the $i$-th learner (vector) at iteration $g$ are given by

$$X^{g}_{(i)} = \left[ x^{g}_{(i,1)}, x^{g}_{(i,2)}, \ldots, x^{g}_{(i,j)}, \ldots, x^{g}_{(i,D)} \right] \qquad (3.2)$$
For all the equations used in the algorithm, $i = 1, 2, 3, \ldots, N$, $j = 1, 2, 3, \ldots, D$ and $g = 1, 2, 3, \ldots, G$. All $rand$ values follow the uniform distribution.
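As a concrete illustration, the initialization step can be sketched in Python (a minimal sketch assuming NumPy; the names `initialize_population`, `x_min` and `x_max` are illustrative, not from the source):

```python
import numpy as np

def initialize_population(N, D, x_min, x_max, rng=None):
    """Randomly initialize N learners over D subjects, as in Eq. (3.1)."""
    rng = rng or np.random.default_rng()
    x_min = np.asarray(x_min, dtype=float)   # length-D lower bounds
    x_max = np.asarray(x_max, dtype=float)   # length-D upper bounds
    # Each entry: x_min_j + rand * (x_max_j - x_min_j), with rand ~ U(0, 1).
    return x_min + rng.random((N, D)) * (x_max - x_min)
```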
3.2.2 Teacher phase
The mean vector $M^g$, containing the mean of the learners in the class for each subject at generation $g$, is computed as

$$M^g = \begin{bmatrix} \operatorname{mean}\left( x^{g}_{(1,1)}, \ldots, x^{g}_{(i,1)}, \ldots, x^{g}_{(N,1)} \right) \\ \vdots \\ \operatorname{mean}\left( x^{g}_{(1,j)}, \ldots, x^{g}_{(i,j)}, \ldots, x^{g}_{(N,j)} \right) \\ \vdots \\ \operatorname{mean}\left( x^{g}_{(1,D)}, \ldots, x^{g}_{(i,D)}, \ldots, x^{g}_{(N,D)} \right) \end{bmatrix}^{T} \qquad (3.3)$$

which effectively gives us

$$M^g = \left[ m^g_1, m^g_2, \ldots, m^g_j, \ldots, m^g_D \right] \qquad (3.4)$$
The vector with the minimum objective function value is taken as the teacher ($X^g_{Teacher}$) for the respective iteration. The algorithm proceeds by shifting the mean of the learners towards the teacher. A randomly weighted differential vector is formed from the current mean and the desired mean vectors and added to the existing population of learners to obtain a new set of improved learners:

$$X^g_{new(i)} = X^g_{(i)} + rand \times \left( X^g_{Teacher} - T_F M^g \right) \qquad (3.5)$$

where $T_F$ is the teaching factor, which decides the value of the mean to be changed at each iteration. The value of $T_F$ can be either 1 or 2 and is decided randomly with equal probability as

$$T_F = \operatorname{round}\left[ 1 + r_2 \right] \qquad (3.6)$$
where $r_2$ is a random number in the range $[0, 1]$. Note that $T_F$ is not a parameter of the TLBO algorithm: its value is not given as an input but is decided randomly by the algorithm using Equation (3.6). After a number of experiments on many benchmark functions it was concluded that the algorithm performs better when the value of $T_F$ is between 1 and 2, and much better when it is exactly 1 or 2. Hence, to simplify the algorithm, the teaching factor is suggested to take the value 1 or 2 according to the rounding criterion of Equation (3.6).
If $X^g_{new(i)}$ is found to be a superior learner to $X^g_{(i)}$ in generation $g$, then it replaces the inferior learner $X^g_{(i)}$ in the matrix.
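A minimal Python sketch of one teacher-phase pass, under the same notation (with `f` an objective function to be minimized; bound handling is omitted for brevity, and the helper name `teacher_phase` is illustrative):

```python
import numpy as np

def teacher_phase(X, f, rng=None):
    """One TLBO teacher-phase pass over the class X (Eqs. 3.3-3.6)."""
    rng = rng or np.random.default_rng()
    N, D = X.shape
    M = X.mean(axis=0)                       # subject-wise mean, Eq. (3.3)
    fitness = np.apply_along_axis(f, 1, X)
    teacher = X[np.argmin(fitness)].copy()   # best learner acts as teacher
    for i in range(N):
        T_F = rng.integers(1, 3)             # teaching factor: 1 or 2, Eq. (3.6)
        X_new = X[i] + rng.random(D) * (teacher - T_F * M)   # Eq. (3.5)
        new_fit = f(X_new)
        if new_fit < fitness[i]:             # keep only superior learners
            X[i], fitness[i] = X_new, new_fit
    return X
```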
3.2.3 Learner phase
This phase consists of the interaction of the learners with one another. The process of mutual interaction tends to increase the knowledge of the learners. Each learner interacts randomly with other learners, which facilitates knowledge sharing. For a given learner $X^g_{(i)}$, another learner $X^g_{(r)}$ is selected randomly ($i \neq r$). The $i$-th vector of the matrix $X^g_{new}$ in the learner phase is given as
$$X^g_{new(i)} = \begin{cases} X^g_{(i)} + rand \times \left( X^g_{(i)} - X^g_{(r)} \right) & \text{if } f\!\left( X^g_{(i)} \right) < f\!\left( X^g_{(r)} \right) \\[4pt] X^g_{(i)} + rand \times \left( X^g_{(r)} - X^g_{(i)} \right) & \text{otherwise} \end{cases} \qquad (3.7)$$
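The learner phase can be sketched analogously (again a minimal sketch; the repeated evaluation of `f` is kept for clarity rather than efficiency):

```python
import numpy as np

def learner_phase(X, f, rng=None):
    """One TLBO learner-phase pass over the class X (Eq. 3.7)."""
    rng = rng or np.random.default_rng()
    N, D = X.shape
    for i in range(N):
        r = rng.integers(N)
        while r == i:                        # partner must differ from i
            r = rng.integers(N)
        if f(X[i]) < f(X[r]):                # i is better: move away from r
            X_new = X[i] + rng.random(D) * (X[i] - X[r])
        else:                                # r is better: move towards r
            X_new = X[i] + rng.random(D) * (X[r] - X[i])
        if f(X_new) < f(X[i]):               # greedy acceptance
            X[i] = X_new
    return X
```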
3.2.4 Algorithm termination
The algorithm terminates after the maximum number of iterations ($G$) is completed.
3.2.5 Algorithm of TLBO
The TLBO algorithm introduced here is shown in the flow chart of Figure 3.1. The following steps explain the TLBO algorithm.
Step 1: Initialize the population size or number of students in the class ($N$), the number of generations ($G$), the number of design variables or subjects (courses) offered, which coincides with the number of units to be placed in the distribution system ($D$), and the limits of the design variables (upper $U_L$ and lower $L_L$ in each case).
Define the optimization problem as: Minimize $f(X)$, where $f(X)$ is the objective function and $X$ is the vector of design variables such that $L_L \leq X \leq U_L$.
Fig. 3.1: Flow diagram of the TLBO algorithm
Step 2: Generate a random population according to the number of students in the class ($N$) and the number of subjects offered ($D$). This population is mathematically expressed as
$$V = \begin{bmatrix} X_{1,1} & X_{1,2} & \cdots & X_{1,D} \\ X_{2,1} & X_{2,2} & \cdots & X_{2,D} \\ \vdots & \vdots & \ddots & \vdots \\ X_{N,1} & X_{N,2} & \cdots & X_{N,D} \end{bmatrix} \qquad (3.8)$$

where $X_{i,j}$ is the initial grade of the $j$-th subject of the $i$-th student.
Step 3: Evaluate the average grade of each subject offered in the class. The average grade of the $j$-th subject at generation $g$ is given by:

$$M^g_j = \operatorname{mean}\left( X_{1,j}, X_{2,j}, \ldots, X_{i,j}, \ldots, X_{N,j} \right) \qquad (3.9)$$

Step 4: Based on the grade point (objective value), sort the students (population) from best to worst. The best solution is considered the teacher and is given by:

$$X_{Teacher} = X \;\big|\; f(X) = \min f(X) \qquad (3.10)$$
Step 5: Modify the grade point of each subject (control variable) of each individual student. The modified grade point of the $j$-th subject of the $i$-th student is given by:

$$X^g_{new(i)} = X^g_{(i)} + rand \times \left( X^g_{Teacher} - T_F M^g \right) \qquad (3.11)$$

$$X^g_{new(i)} = X^g_{(i)} + r_1 \times \left( X^g_{Teacher} - \left( 1 + \operatorname{round}(r_2) \right) \times M^g \right) \qquad (3.12)$$

where $r_1$ and $r_2$ are random numbers in the range $[0, 1]$.
Step 6: Every learner improves the grade point of each subject through mutual interaction with the other learners. Each learner interacts randomly with other learners, which facilitates knowledge sharing. For a given learner $X^g_{(i)}$, another learner $X^g_{(r)}$ is selected randomly ($i \neq r$). The grade point of the $j$-th subject of the $i$-th learner is modified by

$$X^g_{new(i)} = \begin{cases} X^g_{(i)} + rand \times \left( X^g_{(i)} - X^g_{(r)} \right) & \text{if } f\!\left( X^g_{(i)} \right) < f\!\left( X^g_{(r)} \right) \\[4pt] X^g_{(i)} + rand \times \left( X^g_{(r)} - X^g_{(i)} \right) & \text{otherwise} \end{cases} \qquad (3.13)$$
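Putting Steps 1-6 together, a compact driver loop might look as follows (a sketch only: it reuses the `initialize_population`, `teacher_phase` and `learner_phase` helpers sketched earlier and omits the bound handling and discretization shown in the flow chart):

```python
import numpy as np

def tlbo(f, N, D, G, x_min, x_max, seed=None):
    """Minimize f over the box [x_min, x_max] with a basic TLBO loop."""
    rng = np.random.default_rng(seed)
    X = initialize_population(N, D, x_min, x_max, rng)   # Step 2
    for g in range(G):                                   # Steps 3-6, repeated
        X = teacher_phase(X, f, rng)
        X = learner_phase(X, f, rng)
    best = min(X, key=f)                                 # the final 'teacher'
    return best, f(best)

# Example: minimizing the sphere function in D = 5 dimensions.
if __name__ == "__main__":
    sphere = lambda x: float(np.sum(np.asarray(x) ** 2))
    x_best, f_best = tlbo(sphere, N=20, D=5, G=100,
                          x_min=[-10.0] * 5, x_max=[10.0] * 5, seed=1)
    print(x_best, f_best)
```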
3.3 Modified Teaching Learning Based Optimization (MTLBO)
Algorithm
3.3.1 Modification of TLBO
Since the original TLBO is based on the principles of the teaching-learning approach, an analogy with a real classroom or learning scenario can always be drawn while designing the algorithm. A teacher always wishes that the students acquire knowledge equal to his or her own as fast as possible, but at times this is difficult for a student because of the student's tendency to forget. The teaching-learning process is an iterative one in which continuous interaction takes place for the transfer of knowledge. Every time a teacher interacts with a student, he or she finds that the student can recall only part of the lessons learnt in the last session, mainly due to the physiological behaviour of neurons in the brain. In this work, the motivation is to include a parameter [73, 74, 75, 76] known as the "weight" in equations (3.11) and (3.13) of the original TLBO. In contrast to the original TLBO, in the proposed approach a part of the learner's previous value is retained while computing its new value, and that part is decided by a weight factor $w$. It is generally considered good practice to encourage the individuals to sample diverse zones of the search space during the early stages of the search; during the later stages it is important to fine-tune the movements of the trial solutions so that they explore the interior of the relatively small region in which the suspected global optimum lies. To meet this objective, the value of the weight factor is reduced linearly with time from a (predetermined) maximum to a (predetermined) minimum value:
$$w = w_{\max} - \left( \frac{w_{\max} - w_{\min}}{G} \right) \times g \qquad (3.14)$$

where $w_{\max}$ and $w_{\min}$ are the maximum and minimum values of the weight factor $w$, $g$ is the current iteration number and $G$ is the maximum number of allowable iterations. $w_{\max}$ and $w_{\min}$ are selected as 0.9 and 0.1, respectively. Hence, in the teacher phase the new set of improved learners becomes

$$X^g_{new(i)} = w \times X^g_{(i)} + rand \times \left( X^g_{Teacher} - T_F M^g \right) \qquad (3.15)$$
and the set of improved learners in the learner phase becomes

$$X^g_{new(i)} = \begin{cases} w \times X^g_{(i)} + rand \times \left( X^g_{(i)} - X^g_{(r)} \right) & \text{if } f\!\left( X^g_{(i)} \right) < f\!\left( X^g_{(r)} \right) \\[4pt] w \times X^g_{(i)} + rand \times \left( X^g_{(r)} - X^g_{(i)} \right) & \text{otherwise} \end{cases} \qquad (3.16)$$
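The weight schedule of Equation (3.14) and the weighted teacher-phase update of Equation (3.15) can be sketched as follows (assuming the NumPy conventions of the earlier sketches; the helper names are illustrative):

```python
import numpy as np

def weight(g, G, w_max=0.9, w_min=0.1):
    """Linearly decreasing weight factor of Eq. (3.14)."""
    return w_max - (w_max - w_min) * g / G

def mtlbo_teacher_step(X_i, teacher, M, g, G, rng):
    """Weighted teacher-phase update of a single learner, Eq. (3.15)."""
    w = weight(g, G)
    T_F = rng.integers(1, 3)                 # teaching factor: 1 or 2
    # Only a w-weighted part of the learner's previous value is retained,
    # modelling the 'forgetting' behaviour described above.
    return w * X_i + rng.random(X_i.size) * (teacher - T_F * M)
```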
3.3.2 Advantages of MTLBO
• It is reliable, accurate and robust.
• The total computational time is low.
• Its consistency is high.
• There is no algorithm-specific parameter selection problem.
• No crossover or mutation rates need to be tuned.
• It is quite efficient compared to other algorithms.
• TLBO finds better or equal solutions much faster than other algorithms.
• It gives better performance with less computational time on problems with high dimensions.
3.3.3 Disadvantages of MTLBO
• It may converge prematurely if proper precautions are not taken.
3.4 Multi-Objective Teaching Learning Based Optimization
Algorithm
3.4.1 Multi-objective optimization: A brief overview
Multi-objective optimization [77]-[79] involves the simultaneous optimization of several incommensurable and often competing objectives. In the absence of any preference information, a set of non-dominated solutions is obtained instead of a single optimal solution. These optimal solutions are termed Pareto-optimal solutions.
Let us consider a minimization problem:

$$\text{Minimize } f(X) = \left\{ f_1(X), f_2(X), \ldots, f_M(X) \right\} \qquad (3.17)$$

subject to the constraints:

$$g_i(X) = 0, \quad i = 1, 2, \ldots, m \qquad (3.18)$$

$$h_j(X) \leq 0, \quad j = 1, 2, \ldots, n \qquad (3.19)$$

where $X$ is called the decision vector, $M$ is the number of objectives ($M \geq 2$), and $m$ and $n$ are the numbers of equality and inequality constraints, respectively. A solution vector $a$ dominates $b$ if and only if $a$ is partially less than $b$ ($a <_p b$), i.e. for all $i \in \{1, 2, \ldots, M\}$

$$f_i(a) \leq f_i(b) \qquad (3.20)$$

with strict inequality holding for at least one objective. Solutions that are not dominated by any other solution of a given set are considered non-dominated with respect to that set. The front obtained by mapping these non-dominated solutions into objective space is called the Pareto-optimal set or the Pareto-optimal front (POF):

$$POF = \left\{ f(X) = \left( f_1(X), f_2(X), \ldots, f_M(X) \right) \;\middle|\; X \in S \right\} \qquad (3.21)$$

where $S$ is the set of obtained non-dominated solutions. Determining the complete Pareto-optimal front is a very difficult task owing to the computational complexity involved and the presence of a large number of suboptimal Pareto fronts. Considering the existing memory constraints, determination of the complete Pareto front becomes infeasible, and the obtained solutions are therefore required to be diverse, covering the maximum possible regions of it.
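For concreteness, the dominance test of Equation (3.20), including the strict-inequality requirement, can be sketched in Python (the helper names are illustrative):

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization):
    no worse in every objective and strictly better in at least one."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a <= b) and np.any(a < b))

def non_dominated(front):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in front
            if not any(dominates(q, p) for q in front)]
```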
3.4.2 Non-dominated sorting teaching learning based optimization
algorithm
The Multi-objective TLBO algorithm is a very new algorithm, introduced only recently [80]-[82]. This optimization technique works on the dependence of the learners in a class on the quality of the teacher. The teacher raises the average performance of the class and shares knowledge with the rest of the class. The individuals are free to perform on their own and to excel after the knowledge is shared. The whole procedure is divided into two phases, the Teacher phase and the Learner phase.
3.4.2.1 Initialization
The population X is randomly initialized within the bounded search space as a matrix of N rows and D columns. The value N represents the number of learners in the class, i.e. the "class size" or "population size". The value D represents the number of "subjects or courses offered" to the learners, which is the same as the dimensionality of the problem. The procedure, being iterative, is set to run for a maximum of G iterations (generations). The $j$-th parameter of the $i$-th learner (vector) in the initial generation is assigned a random value using

$$x^{(1)}_{(i,j)} = x^{\min}_j + rand \times \left( x^{\max}_j - x^{\min}_j \right) \qquad (3.22)$$

where $rand$ represents a uniformly distributed random variable within the range $(0, 1)$, and $x^{\min}_j$ and $x^{\max}_j$ represent the minimum and maximum values of the $j$-th parameter. The parameters of the $i$-th learner (vector) at generation $g$ are given by

$$X^{g}_{(i)} = \left[ x^{g}_{(i,1)}, x^{g}_{(i,2)}, \ldots, x^{g}_{(i,j)}, \ldots, x^{g}_{(i,D)} \right] \qquad (3.23)$$

For all the equations used in the algorithm, $i = 1, 2, 3, \ldots, N$, $j = 1, 2, 3, \ldots, D$ and $g = 1, 2, 3, \ldots, G$. All $rand$ values follow the uniform distribution.
The objective values at a given generation form a column vector. In a dual-objective scenario, such as this one, two objective values are present for the same row vector. The two objectives ($a$ and $b$) are evaluated as

$$\begin{bmatrix} Ya^{g}_{(i)} \\[4pt] Yb^{g}_{(i)} \end{bmatrix} = \begin{bmatrix} f_a\!\left( X^{g}_{(i)} \right) \\[4pt] f_b\!\left( X^{g}_{(i)} \right) \end{bmatrix} \qquad (3.24)$$
3.4.2.2 Teacher phase
The mean vector $M^g$, containing the mean of the learners in the class for each subject at generation $g$, is computed as

$$M^g = \begin{bmatrix} \operatorname{mean}\left( x^{g}_{(1,1)}, \ldots, x^{g}_{(i,1)}, \ldots, x^{g}_{(N,1)} \right) \\ \vdots \\ \operatorname{mean}\left( x^{g}_{(1,j)}, \ldots, x^{g}_{(i,j)}, \ldots, x^{g}_{(N,j)} \right) \\ \vdots \\ \operatorname{mean}\left( x^{g}_{(1,D)}, \ldots, x^{g}_{(i,D)}, \ldots, x^{g}_{(N,D)} \right) \end{bmatrix}^{T} \qquad (3.25)$$

which effectively gives us

$$M^g = \left[ m^g_1, m^g_2, \ldots, m^g_j, \ldots, m^g_D \right] \qquad (3.26)$$
The vector with the minimum objective function value is taken as the teacher ($X^g_{Teacher}$) for the respective iteration. The algorithm proceeds by shifting the mean of the learners towards the teacher. A randomly weighted differential vector is formed from the current mean and the desired mean vectors and added to the existing population of learners to obtain a new set of improved learners:

$$X^g_{new(i)} = X^g_{(i)} + rand \times \left( X^g_{Teacher} - T_F M^g \right) \qquad (3.27)$$

where $T_F$ is the teaching factor (either 1 or 2), which decides the value of the mean to be changed at each iteration. The value of $T_F$ is decided randomly with equal probability as

$$T_F = \operatorname{round}\left[ 1 + r_2 \right] \qquad (3.28)$$

where $r_2$ is a random number in the range $[0, 1]$. $T_F$ is not a parameter of the TLBO algorithm.
If the new learner $X^g_{new(i)}$ is found to be superior to $X^g_{(i)}$ in generation $g$, then it replaces the inferior learner $X^g_{(i)}$ using the non-dominated sorting algorithm.
3.4.2.3 Learner Phase
This phase consists of the interaction of the learners with one another. The process of mutual interaction tends to increase the knowledge of the learners. Each learner interacts randomly with other learners, which facilitates knowledge sharing. For a given learner $X^g_{(i)}$, another learner $X^g_{(r)}$ is selected randomly ($i \neq r$). The $i$-th vector of the matrix $X^g_{new}$ in the learner phase is given as

$$X^g_{new(i)} = \begin{cases} X^g_{(i)} + rand \times \left( X^g_{(i)} - X^g_{(r)} \right) & \text{if } f\!\left( X^g_{(i)} \right) < f\!\left( X^g_{(r)} \right) \\[4pt] X^g_{(i)} + rand \times \left( X^g_{(r)} - X^g_{(i)} \right) & \text{otherwise} \end{cases} \qquad (3.29)$$
The NSTLBO algorithm, owing to the multi-objective requirements, adapts to the scenario by maintaining one $X^g_{new(i)}$ matrix per objective in the learner phase. The learner-phase operations for a dual-objective problem are therefore

$$X^g_{new_a(i)} = \begin{cases} X^g_{(i)} + rand \times \left( X^g_{(i)} - X^g_{(r)} \right) & \text{if } f_a\!\left( X^g_{(i)} \right) < f_a\!\left( X^g_{(r)} \right) \\[4pt] X^g_{(i)} + rand \times \left( X^g_{(r)} - X^g_{(i)} \right) & \text{otherwise} \end{cases} \qquad (3.30)$$

$$X^g_{new_b(i)} = \begin{cases} X^g_{(i)} + rand \times \left( X^g_{(i)} - X^g_{(r)} \right) & \text{if } f_b\!\left( X^g_{(i)} \right) < f_b\!\left( X^g_{(r)} \right) \\[4pt] X^g_{(i)} + rand \times \left( X^g_{(r)} - X^g_{(i)} \right) & \text{otherwise} \end{cases} \qquad (3.31)$$

The $X^g_{new_a(i)}$ and $X^g_{new_b(i)}$ matrices are passed together to the non-dominated sorting algorithm, and only the $N$ best learners are selected for the next iteration.
3.4.2.4 Algorithm termination
The algorithm terminates after $G$ iterations are completed. The final set of learners represents the Pareto curve through their objective values.
3.4.2.5 Reducing the Pareto set
Normally, the Pareto-optimal set can be extremely large in some problems. Reducing the set of non-dominated solutions without destroying the characteristics of the Pareto front is desirable from the decision maker's point of view. When the size of the Pareto front set exceeds the limit preset by the user, the solution with the smallest crowding distance is removed.
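The source does not spell out the crowding-distance computation; assuming the NSGA-II-style definition commonly paired with non-dominated sorting, a sketch might read:

```python
import numpy as np

def crowding_distance(F):
    """Crowding distance of each point in an (n, M) objective array."""
    F = np.asarray(F, dtype=float)
    n, M = F.shape
    dist = np.zeros(n)
    for m in range(M):
        order = np.argsort(F[:, m])                   # sort along objective m
        span = F[order[-1], m] - F[order[0], m]
        if span == 0.0:
            span = 1.0                                # all equal: avoid /0
        dist[order[0]] = dist[order[-1]] = np.inf     # always keep extremes
        for k in range(1, n - 1):
            dist[order[k]] += (F[order[k + 1], m] - F[order[k - 1], m]) / span
    return dist

# Truncation: while the archive exceeds the preset limit, drop the solution
# with the smallest crowding distance and recompute.
```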
3.4.2.6 Determining best compromise solution
After obtaining the Pareto front, the best compromise solution is extracted from it. A fuzzy-based mechanism, as proposed in [81], is applied to obtain the best compromise solution, which can later be offered to the Decision Maker (DM). The membership value of each individual in the Pareto-optimal set with respect to the $i$-th objective $F_i$ is computed using the membership function defined as follows:

$$\mu_i = \begin{cases} 1, & \text{if } F_i \leq F_i^{\min} \\[6pt] \dfrac{F_i^{\max} - F_i}{F_i^{\max} - F_i^{\min}}, & \text{if } F_i^{\min} < F_i < F_i^{\max} \\[10pt] 0, & \text{if } F_i \geq F_i^{\max} \end{cases} \qquad (3.32)$$
where $\mu_i$ stands for the membership value of the $i$-th objective function.

For each non-dominated solution $k$, the normalized membership value $\mu^k$ is calculated using

$$\mu^k = \frac{\displaystyle\sum_{i=1}^{M} \mu_i^k}{\displaystyle\sum_{k=1}^{N_{pareto}} \sum_{i=1}^{M} \mu_i^k} \qquad (3.33)$$

where $N_{pareto}$ is the number of non-dominated solutions in the Pareto-optimal front and $M$ is the number of objective functions.

The best compromise solution is the one for which $\mu^k$ is maximum.
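Equations (3.32) and (3.33) translate directly into a short NumPy sketch (assuming `F` is an array of shape `(N_pareto, M)` holding the objective values of the non-dominated solutions; the guard for degenerate objectives is an added assumption):

```python
import numpy as np

def best_compromise(F):
    """Index of the best compromise solution via Eqs. (3.32)-(3.33)."""
    F = np.asarray(F, dtype=float)
    F_min, F_max = F.min(axis=0), F.max(axis=0)
    denom = np.where(F_max > F_min, F_max - F_min, 1.0)  # assumed guard vs /0
    mu = np.clip((F_max - F) / denom, 0.0, 1.0)          # Eq. (3.32)
    mu_k = mu.sum(axis=1) / mu.sum()                     # Eq. (3.33)
    return int(np.argmax(mu_k))
```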
3.5 Conclusions
The Teaching Learning Based Optimization process can be used to solve the optimal placement and sizing problems of Distributed Generation units in various systems; in particular, it can be implemented for radial distribution systems. The method is capable of addressing the modeling challenges posed by radial distribution systems.