

Research Article
DRSCRO: A Metaheuristic Algorithm for Task Scheduling on Heterogeneous Systems

Yuyi Jiang,1 Zhiqing Shao,1 Yi Guo,1,2 Huanhuan Zhang,1 and Kun Niu1

1College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
2School of Information Science and Engineering, Shihezi University, Shihezi 832003, China

Correspondence should be addressed to Zhiqing Shao; zshao@ecust.edu.cn

Received 12 August 2015; Revised 13 November 2015; Accepted 23 November 2015

Academic Editor: Ching-Ter Chang

Copyright © 2015 Yuyi Jiang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

An efficient DAG task scheduling is crucial for leveraging the performance potential of a heterogeneous system, and finding a schedule that minimizes the makespan (i.e., the total execution time) of a DAG is known to be NP-complete. A recently proposed metaheuristic method, Chemical Reaction Optimization (CRO), demonstrates its capability for solving NP-complete optimization problems. This paper develops an algorithm named Double-Reaction-Structured Chemical Reaction Optimization (DRSCRO) for DAG scheduling on heterogeneous systems, which modifies the conventional CRO framework and incorporates CRO with the variable neighborhood search (VNS) method. DRSCRO has two reaction phases, for super molecule selection and global optimization, respectively. In the molecule selection phase, CRO, as a metaheuristic algorithm, is adopted to obtain a super molecule for accelerating convergence. To promote the intensification capability, in the global optimization phase, the VNS algorithm with a new processor selection model is used as the initialization, under the consideration of both scheduling order and processor assignment, and the load balance neighborhood structure of VNS is also utilized in the ineffective reaction operator. The experimental results verify the effectiveness and efficiency of DRSCRO in terms of makespan and convergence rate.

1. Introduction

A large application can be decomposed into several smaller modules (i.e., tasks) processed in parallel on heterogeneous computing systems. An efficient task scheduling is crucial for leveraging the performance potential of a heterogeneous system. The problem of task scheduling on a heterogeneous system can be stated as assigning processors to tasks so as to minimize the makespan (i.e., the total execution time). As a task is ready only after all of its predecessors have been executed, tasks with precedence constraints can be modeled as directed acyclic graphs (DAGs), where the nodes and the directed edges represent the tasks and the communications between the tasks, respectively. Finding a schedule that minimizes the execution time of a parallel program is known to be NP-complete [1]. Therefore, two classes of scheduling strategies, heuristic and metaheuristic, have been developed for searching for a suboptimal solution with lower execution time.

Heuristic scheduling strategies focus on identifying a solution by exploiting heuristics; an important class of such algorithms is list scheduling [2–12], for example, heterogeneous earliest finish time (HEFT) [3]. List scheduling consists of two basic phases: constructing a scheduling list of tasks ordered by the priority of each task, and mapping each task to a processor in priority order according to a greedy approach (i.e., the highest-priority task is assigned to the processor that allows the earliest finish time). The performance of heuristic-based algorithms depends heavily on the effectiveness of the heuristics.
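As an illustration, the two phases above can be sketched in a few lines of Python. This is a generic earliest-finish-time list scheduler, not the exact HEFT procedure (which additionally derives the priority list from upward ranks), and the input structures (`preds`, `cost`, `comm`) are hypothetical:

```python
# Minimal list-scheduling sketch: tasks are taken in a precomputed
# priority order (assumed topological) and greedily mapped to the
# processor that yields the earliest finish time.

def list_schedule(priority_order, preds, cost, comm, n_procs):
    proc_free = [0.0] * n_procs          # when each processor becomes idle
    aft = {}                             # actual finish time per task
    where = {}                           # processor chosen per task
    for v in priority_order:
        best = None
        for p in range(n_procs):
            # data from a predecessor on another processor incurs comm cost
            ready = max([aft[u] + (0 if where[u] == p else comm[(u, v)])
                         for u in preds[v]], default=0.0)
            est = max(proc_free[p], ready)
            eft = est + cost[v][p]
            if best is None or eft < best[0]:
                best = (eft, p)
        eft, p = best
        aft[v], where[v] = eft, p
        proc_free[p] = eft
    return max(aft.values()), where      # makespan and the mapping
```

Note the greedy trade-off the sketch exposes: a slower processor can still win a task when co-locating it with a predecessor avoids the communication cost.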

Metaheuristic scheduling strategies, such as Ant Colony Optimization (ACO) [13], Genetic Algorithms (GA) [14–21], Tabu Search (TS) [22, 23], and Simulated Annealing (SA) [24], search the solution space in a direct manner and produce consistent, high-quality results on a wide range of problems, although, in comparison with heuristic-based algorithms, these strategies always cost much more time. Chemical Reaction Optimization (CRO) is a new metaheuristic method and has shown its efficiency in solving NP-complete problems [25–29]. There are only two CRO-based algorithms [27, 30]

Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2015, Article ID 396582, 20 pages, http://dx.doi.org/10.1155/2015/396582


for DAG scheduling on heterogeneous systems so far, to our knowledge. These two algorithms both focus on DAG scheduling with the objective of minimizing the makespan. However, as metaheuristic scheduling strategies, CRO-based algorithms for DAG scheduling still have very high time cost, and their convergence rates also need to be improved. In [30], the concept of the super molecule is applied for accelerating convergence, and the super molecule is selected by heuristic scheduling strategies. However, the performance of this kind of super molecule selection method is affected by the range of problems.

This paper proposes an algorithm, Double-Reaction-Structured CRO (DRSCRO), for DAG task scheduling on heterogeneous systems, aiming at obtaining schedules with better quality. In this paper, the conventional CRO framework scheme is modified, and two reaction phases, one for super molecule selection and another for global optimization, are developed in DRSCRO. CRO, as a metaheuristic algorithm, is utilized in the molecule selection phase to obtain a super molecule [31] for a better convergence rate. The variable neighborhood search (VNS) algorithm [32], with a new processor selection model as well as its neighborhood structure, is also utilized to promote the intensification capability in the global optimization phase.

There are three major contributions of this work:

(1) Developing DRSCRO by modifying the conventional CRO framework and utilizing a metaheuristic method to obtain a super molecule for accelerating convergence.

(2) Utilizing the VNS [32] algorithm with a new processor selection model as the initialization of the global optimization phase, which takes into account the optimization of both the scheduling order and the processor assignment, and applying one of its neighborhood structures in the reaction operator to promote the intensification capability of DRSCRO.

(3) Conducting simulation experiments to prove the efficiency and effectiveness of DRSCRO in terms of makespan and convergence rate.

The next section introduces relevant research on the DAG scheduling problem on heterogeneous systems. Section 3 describes the models of the studied problem as a formal statement. Section 4 presents the design of the proposed DRSCRO for DAG scheduling. In Section 5, the simulation performance of DRSCRO is analyzed and compared with some existing algorithms. Section 6 draws the conclusions of this paper and gives suggestions for future research.

2. Literature Review

The DAG scheduling problem, which has been proven to be NP-hard in general [1], can be formulated as the search for an optimal assignment of the tasks in a DAG onto a set of processors that minimizes the total schedule length (i.e., the makespan). The various scheduling algorithms proposed over the last decade fall into two main categories: heuristic (deterministic) and metaheuristic (nondeterministic). As metaheuristic methods, CRO-based algorithms for DAG scheduling on heterogeneous systems are based on the Chemical Reaction Optimization (CRO) algorithm, which was proposed very recently and has shown its power to deal with NP-complete problems.

2.1. Heuristic and Metaheuristic Methods. The heuristic methods are based on heuristics extracted from intuitions, and the most important class of them is list scheduling algorithms [2–12]. The HEFT algorithm, proposed by Topcuoglu et al. [3], utilizes the average execution cost of each task as an upward-ranking heuristic to calculate the task priority. At each step of HEFT, the task with the highest upward rank is selected and mapped to a processor with a greedy approach (i.e., the assigned processor minimizes the earliest finish time of the selected task). Experimental results show that HEFT obtains better performance in schedule quality and computational cost than the other list scheduling algorithms. The performance of heuristic-based algorithms heavily relies on the effectiveness of the heuristics: the higher the complexity of the DAG scheduling problem, the harder it is for greedy heuristics to produce consistent results on a wide range of problems. Different from heuristic-based algorithms, the metaheuristic methods use a guided-random-search-based process for solution searching. They typically require sufficient sampling of candidate solutions in the search space and have shown robust performance on a variety of scheduling problems. In particular, GA has been widely used to evolve solutions for many task scheduling problems as the most representative metaheuristic method [21]. Many metaheuristic algorithms have been utilized to solve the DAG scheduling problem successfully, such as GA [14–21], ACO [13], SA [24], TS [22, 23], CRO [27, 30], VNS [21], and energy-efficient stochastic methods [33].

According to the No-Free-Lunch Theorem [34], all well-designed metaheuristic methods have the same performance in searching for optimal solutions when averaged over all possible fitness functions. In comparison with the heuristic methods, the metaheuristic methods, which always have much higher computational cost, can obtain better performance in terms of schedule quality, because the metaheuristic methods can search a wider area of the solution space with guided-random-search-based processes, while the search of heuristic-based algorithms is narrowed down to a much smaller portion by means of the heuristics.

2.2. CRO-Based Algorithms for DAG Scheduling on Heterogeneous Systems. CRO was proposed by Lam and Li very recently [25], and, as far as we know, Double Molecular Structure-Based Chemical Reaction Optimization (DMSCRO) [27] and Tuple-Based Chemical Reaction Optimization (TMSCRO) [30] are the only two CRO-based algorithms for DAG scheduling on heterogeneous systems. CRO-based algorithms mimic the chemical reaction process, which accords with energy conservation, in a closed container. The molecules, with two kinds of energy, potential energy (PE) and kinetic energy (KE), in CRO-based


Table 1: Parameters used in CRO.

PE — current potential energy of a molecule
KE — current kinetic energy of a molecule
InitialKE — initial kinetic energy of a molecule
θ — threshold value guiding the choice of on-wall collision or decomposition
ϑ — threshold value guiding the choice of intermolecular collision or synthesis
Buffer — initial energy in the central energy buffer
KELossRate — loss rate of kinetic energy
MoleColl — threshold value determining whether to perform a unimolecular or an intermolecular reaction
PopSize — size of the population of molecules
iters — number of iterations

algorithms are the solutions to the DAG scheduling problem. The PE value of a molecule is calculated by the fitness function and is equal to the objective value, the makespan, of the corresponding solution. KE helps the molecule escape from local optimums, and its value is nonnegative. A buffer is also used in CRO-based algorithms for energy interchange and conservation. Moreover, to find the solution with the globally minimal makespan, four types of elementary chemical reactions, on-wall ineffective collision, decomposition, intermolecular ineffective collision, and synthesis, are applied for the intensification and the diversification searches. The typical execution flow of the CRO framework adopted in DMSCRO and TMSCRO is as proposed in [25], and the parameters used in CRO are presented in Table 1.
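For concreteness, the way these parameters steer the choice among the four elementary reactions in the canonical CRO framework of [25] can be sketched as follows. The molecule representation (plain dictionaries with a hit counter and a KE value) and the threshold defaults are simplified assumptions for illustration, not the exact DMSCRO/TMSCRO implementation:

```python
import random

# Hedged sketch of CRO's reaction-selection logic (parameter names
# follow Table 1): MoleColl decides between unimolecular and
# intermolecular reactions; the theta / vartheta thresholds trigger
# decomposition / synthesis. A real implementation would then apply
# the chosen elementary-reaction operator to the selected molecule(s).

def choose_reaction(population, MoleColl=0.2, theta=500, vartheta=10,
                    rng=random):
    """Return which of the four elementary reactions fires this step."""
    if rng.random() > MoleColl or len(population) < 2:
        m = rng.choice(population)
        # a molecule stuck near its local optimum is broken apart
        return "decomposition" if m["hits"] > theta else "on-wall"
    m1, m2 = rng.sample(population, 2)
    # two low-KE molecules lack the energy to keep exploring; merge them
    if m1["KE"] <= vartheta and m2["KE"] <= vartheta:
        return "synthesis"
    return "intermolecular"
```

On-wall and intermolecular collisions drive intensification; decomposition and synthesis drive diversification, which is why the latter two are gated by the stagnation counter and the KE threshold.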

As metaheuristic methods, DMSCRO and TMSCRO have better performance in terms of schedule quality than heuristic methods, for the reason presented in the last paragraph of Section 2.1. The experimental results in [27, 30] prove that both DMSCRO and TMSCRO outperform GA. DMSCRO is the first algorithm to apply CRO, proposed by Lam and Li in [25], to solve the DAG scheduling problem, and it enjoys the advantages of both GA and SA. On the one hand, the intermolecular collision and on-wall collision designed in DMSCRO have effects similar to the crossover operation and the mutation operation in GA, respectively. On the other hand, the energy conservation requirement in DMSCRO is able to guide the search for the optimal solution, similarly to the way the Metropolis algorithm of SA guides the evolution of the solutions in SA. Two additional operations, decomposition and synthesis, give DMSCRO more opportunities to jump out of local optimums and explore wider areas in the solution space. This benefit enables DMSCRO to find good solutions faster than GA, which has been widely used to evolve solutions for many task scheduling problems. DMSCRO is not compared with SA in [27, 30] because the underlying principles and philosophies of DMSCRO and SA differ a lot [27]; typically, metaheuristic algorithms operating on a population of solutions, like CRO-based or GA-based algorithms, are able to find good solutions faster than those operating on a single solution, like SA-based algorithms. Compared with DMSCRO, TMSCRO applies the constrained earliest finish time algorithm in data pretreatment to take advantage of the super molecule and constrained critical paths [35], which serve as heuristic information for accelerating convergence. Moreover, the molecule structure and the elementary reaction operators designed in TMSCRO are more reasonable than those in DMSCRO regarding the intensification and diversification of the search of the solution space.

However, for solving the NP-complete problem of DAG scheduling on heterogeneous systems, the CRO-based algorithms TMSCRO and DMSCRO, as metaheuristic scheduling strategies, still have very large time expenditure; therefore, their searching capabilities and convergence rates need to be improved. There are three deficiencies of TMSCRO and DMSCRO. First, in [30] the concept of the super molecule is applied for accelerating convergence, and the super molecule is selected by heuristic scheduling strategies, but the performance of this kind of super molecule selection method is affected by the range of problems. Second, in both TMSCRO and DMSCRO the initial molecules, which are very important for the whole searching process, are randomly created, and the uncertainty of this kind of initialization undermines the searching capabilities of TMSCRO and DMSCRO. Moreover, the intensification capabilities of CRO-based algorithms for DAG scheduling also need to be improved to obtain better average results when the iteration stopping criteria are satisfied.

Therefore, this paper proposes an algorithm, Double-Reaction-Structured CRO (DRSCRO), for DAG task scheduling on heterogeneous systems, aiming at obtaining schedules with better quality. In this paper, the conventional CRO framework scheme is modified, and two reaction phases, one for super molecule selection and another for global optimization, are developed in DRSCRO. CRO, as a metaheuristic algorithm, is utilized in the molecule selection phase to obtain a super molecule [31] for a better convergence rate. Moreover, in the global optimization phase, the variable neighborhood search (VNS) algorithm [21, 32, 36], an effective metaheuristic that uses neighborhood structures and a local search to change the neighborhood systematically, is used to optimize the initial molecule, and one of its neighborhood structures is also adopted in the reaction operator to promote the


intensification capability. In addition, a new model for processor selection is proposed and utilized in the neighborhood structures of the VNS algorithm for better effectiveness.

Moreover, in [21], VNS was incorporated with GA for DAG scheduling, but the task priority was unchangeable in the VNS algorithm of [21], which reduces the efficiency of VNS in obtaining a better solution. Therefore, different from [21], to promote the intensification capability of the whole algorithm, the VNS in DRSCRO is modified under the consideration of optimizing both the scheduling order and the processor assignment.
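The general VNS scheme referred to here, shaking the incumbent in increasingly distant neighborhoods, applying a local search, and restarting from the first neighborhood whenever an improvement is found, can be sketched generically as follows. The neighborhood operators and the local search are placeholders, not the specific structures used in DRSCRO:

```python
import random

# Generic VNS skeleton (after the scheme of [32]): `neighborhoods` is a
# list of functions mapping a solution (plus an RNG) to a neighbor;
# `local_search` and `fitness` are problem-specific callables.

def vns(solution, fitness, neighborhoods, local_search, max_iters=100,
        rng=random):
    best, best_fit = solution, fitness(solution)
    for _ in range(max_iters):
        k = 0
        while k < len(neighborhoods):
            candidate = local_search(neighborhoods[k](best, rng))
            cand_fit = fitness(candidate)
            if cand_fit < best_fit:          # improvement: restart at k = 0
                best, best_fit = candidate, cand_fit
                k = 0
            else:                            # no luck: try a wider neighborhood
                k += 1
    return best, best_fit
```

For DAG scheduling, the solution would be a task order plus a processor assignment, and the remark above about [21] corresponds to whether the neighborhood functions are allowed to perturb the task order as well as the assignment.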

3. Problem Formulation

The DAG scheduling problem typically has two inputs: a heterogeneous system for computing tasks in parallel and a parallel application program (i.e., a DAG). In this paper, the heterogeneous system is assumed to be a static computing system model presented by $P = \{p_i \mid i = 1, 2, 3, \ldots, |P|\}$, which is a fully connected network of processors. The heterogeneity level in this paper is formulated as $(1 + \mathrm{hl})/(1 - \mathrm{hl})$, where the parameter $\mathrm{hl} \in (0, 1)$. In this paper, $\mathrm{EcCost}_{p_j}(v_i)$ represents the computation cost of a task $v_i$ mapped to the processor $p_j$, and the value of each $\mathrm{EcCost}_{p_j}(v_i)$ is randomly chosen within the scope of $[1 - \mathrm{hl}, 1 + \mathrm{hl}]$.

In general, $\mathrm{DAG} = (V, E)$ consists of a task (node) set $V$ and an edge set $E$. $\mathrm{EcCost}_{p_j}(v_i)$ is as defined in the first paragraph of this section, and a processor executes a task in the DAG without preemption. The constraint between tasks $v_i$ and $v_j$ is denoted by the edge $e_{ij}$ ($e_{ij} \in E$), which means that the execution of task $v_j$ can begin only after the execution result of task $v_i$ has been transmitted to task $v_j$. Each edge $e_{ij}$ has a nonnegative weight $\mathrm{comm}(v_i, v_j)$ denoting the communication cost between $v_i$ and $v_j$. Each task in a DAG can only be executed on one processor, and communication can be performed simultaneously by the processors. In addition, when two communicating tasks are mapped to the same processor, their communication cost is zero. $\mathrm{predecessor}(v_i)$ represents the set of the predecessors of $v_i$, while $\mathrm{successor}(v_i)$ represents the set of the successors of $v_i$. The task with no predecessor is denoted by $v_{\mathrm{entry}}$, while the task with no successor is denoted by $v_{\mathrm{exit}}$.

Consider that there is a DAG with $|V|$ tasks to be mapped to a heterogeneous system with $|P|$ processors. Assuming that the highest-priority ready task $v_i$ is on the processor $p_j$, the earliest start time of $v_i$, $T_{\mathrm{ESTime}}(v_i, p_j)$, can be formulated as

$$T_{\mathrm{ESTime}}(v_i, p_j) = \max\{T_{\mathrm{avail}}(p_j), T_{\mathrm{ready}}(v_i, p_j)\}, \quad (1)$$

where $T_{\mathrm{avail}}(p_j)$, defined in (2), is the time when the processor $p_j$ becomes available for the execution of the task $v_i$:

$$T_{\mathrm{avail}}(p_j) = \max_{v_k \in \mathrm{exec}(p_j)} \{T_{\mathrm{AFTime}}(v_k)\}, \quad (2)$$

where $\mathrm{exec}(p_j)$ represents all the tasks that have already been scheduled on the processor $p_j$ and $T_{\mathrm{AFTime}}(v_k)$ denotes the actual finish time when the task $v_k$ finishes its execution. $T_{\mathrm{ready}}(v_i, p_j)$ in (1) represents the time when all the data needed for the processing of $v_i$ have been transmitted to $p_j$, which is formulated as

$$T_{\mathrm{ready}}(v_i, p_j) = \max_{v_k \in \mathrm{predecessor}(v_i)} \{T_{\mathrm{AFTime}}(v_k) + \mathrm{comm}(v_k, v_i)\}, \quad (3)$$

where $T_{\mathrm{AFTime}}(v_k)$ has the same definition as in (2) and $\mathrm{predecessor}(v_i)$ denotes the set of all the immediate predecessors of task $v_i$. $\mathrm{comm}(v_k, v_i)$ is 0 if the tasks $v_k$ and $v_i$ are mapped to the same processor $p_j$.

If task $v_i$ is mapped to the processor $p_j$ with a nonpreemptive processing approach, the earliest finish time of task $v_i$, $T_{\mathrm{EFTime}}(v_i, p_j)$, is formulated as

$$T_{\mathrm{EFTime}}(v_i, p_j) = T_{\mathrm{ESTime}}(v_i, p_j) + \mathrm{EcCost}_{p_j}(v_i). \quad (4)$$

After the task $v_i$ is executed by the processor $p_j$, $T_{\mathrm{EFTime}}(v_i, p_j)$ is assigned to $T_{\mathrm{AFTime}}(v_i)$. The makespan of the entire parallel program is equivalent to the actual finish time of the exit task $v_{\mathrm{exit}}$:

$$\mathit{makespan} = \max_{v_i \in V} \{T_{\mathrm{AFTime}}(v_i)\} = T_{\mathrm{AFTime}}(v_{\mathrm{exit}}). \quad (5)$$

The communication-to-computation ratio (CCR) can be formulated as

$$\mathrm{CCR} = \frac{\sum_{e_{ij} \in E} \mathrm{comm}(v_i, v_j)}{\sum_{v_i \in V} W(v_i)}, \quad (6)$$

where $W(v_i)$ is the average computation cost of task $v_i$ and can be calculated as follows:

$$W(v_i) = \frac{\sum_{k=1}^{|P|} \mathrm{EcCost}_{p_k}(v_i)}{|P|}. \quad (7)$$

A simple four-task DAG and a heterogeneous computation system with three processors are shown in Figures 1(a) and 1(b), respectively. The definitions of the notations can be found in Table 2.
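Equations (1)–(7) translate directly into a schedule simulation. The following sketch assumes hypothetical input structures (predecessor lists, an $\mathrm{EcCost}$ matrix as `cost[v][p]`, and a `comm` dictionary keyed by edges) and is illustrative only:

```python
# Sketch of equations (1)-(7): given a task order (assumed topological)
# and a processor assigned to each task, simulate the schedule to obtain
# the makespan; then compute the CCR of the DAG.

def simulate(order, proc_of, preds, cost, comm, n_procs):
    avail = [0.0] * n_procs                  # T_avail per processor, eq. (2)
    aft = {}                                 # T_AFTime per task
    for v in order:
        p = proc_of[v]
        # T_ready, eq. (3): comm cost is zero on the same processor
        ready = max([aft[u] + (0 if proc_of[u] == p else comm[(u, v)])
                     for u in preds[v]], default=0.0)
        est = max(avail[p], ready)           # T_ESTime, eq. (1)
        aft[v] = est + cost[v][p]            # eq. (4), assigned to AFT
        avail[p] = aft[v]
    return max(aft.values())                 # makespan, eq. (5)

def ccr(comm, cost, n_procs):
    # eq. (6): total comm cost over total average computation cost, eq. (7)
    avg = {v: sum(c) / n_procs for v, c in cost.items()}
    return sum(comm.values()) / sum(avg.values())
```

A high CCR marks a communication-dominated DAG, where co-locating communicating tasks matters more than picking the fastest processor; a low CCR marks a computation-dominated one.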

4. Design of DRSCRO

DRSCRO imitates the molecular interactions in chemical reactions, based on the concepts of atoms, molecules, molecular structure, and the energy of a molecule. In DRSCRO, a molecule corresponds to a scheduling solution in DAG scheduling, with a unique molecular structure representing the atom positions in the molecule. We utilize the molecular structure of TMSCRO in our work in consideration of its capability to represent the constrained relationships between the tasks in a molecule (solution). In addition, the energy of each molecule corresponds to the fitness value of a solution. The molecular interactions try to reconstruct more stable molecular structures with lower energy. There are four kinds of basic chemical reactions: on-wall ineffective


Table 2: Definitions of notations.

$\mathrm{DAG} = (V, E)$ — input directed acyclic graph, with $|V|$ nodes representing tasks and $|E|$ edges representing constrained relations among the tasks
$P = \{p_i \mid i = 1, 2, 3, \ldots, |P|\}$ — set of heterogeneous processors in the target system
$\mathrm{EcCost}_{p_j}(v_i)$ — execution cost of task $v_i$ using processor $p_j$
$\mathrm{comm}(v_k, v_i)$ — communication cost from task $v_k$ to task $v_i$
$T_{\mathrm{ESTime}}(v_i, p_j)$ — earliest start time of task $v_i$ mapped to processor $p_j$
$T_{\mathrm{EFTime}}(v_i, p_j)$ — earliest finish time of task $v_i$ mapped to processor $p_j$
$T_{\mathrm{avail}}(p_j)$ — time when processor $p_j$ is available
$T_{\mathrm{ready}}(v_i, p_j)$ — time when all the data needed for the processing of $v_i$ have been transmitted to $p_j$
$T_{\mathrm{AFTime}}(v_k)$ — actual finish time when task $v_k$ finishes its execution
$\mathrm{predecessor}(v_i)$ — set of the predecessors of task $v_i$
$\mathrm{successor}(v_i)$ — set of the successors of task $v_i$
$\mathrm{exec}(p_j)$ — set of the tasks that have already been scheduled on processor $p_j$
$W(v)$ — average computation cost of task $v$
CCR — communication-to-computation ratio
hl — parameter for adjusting the heterogeneity level in a heterogeneous system

Figure 1: (a) DAG model. (b) Heterogeneous computation system model with three processors ($p_1$, $p_2$, $p_3$).

collision, decomposition, intermolecular ineffective collision, and synthesis, for the molecular interactions in DRSCRO, and each kind of reaction contains two subclasses. These two subclasses of reaction operators are applied in the phase of super molecule selection and the phase of global optimization, respectively.

4.1. Framework of DRSCRO. The framework of DRSCRO for scheduling a DAG job is shown in Figure 2, with two basic phases: the phase of super molecule selection and the phase of global optimization. In each phase, DRSCRO first initializes the process of the phase, and then the phase process enters iteration.

In this framework, DRSCRO first executes the phase of super molecule selection to obtain the super molecule (i.e., simply the molecule with the globally minimal makespan), SMole, with the other output molecules serving as the input of the subsequent global optimization phase for the first time (in later rounds, the input of the VNS algorithm is the population with SMole after each iteration in the global optimization phase), and then DRSCRO performs the phase of global optimization to approach the globally optimal solution. The VNS algorithm


Figure 2: Framework of DRSCRO. (For each phase, the flowchart shows initialization, molecule selection, the decomposition and synthesis criteria checks, the four reaction operators — OnWallSMS, DecompSMS, IntermoleSMS, and SynthSMS in the super molecule selection phase, and OnWallGO, DecompGO, IntermoleGO, and SynthGO in the global optimization phase — and fitness evaluation, iterating until the next-phase or stopping criteria are matched; the global min point is the final output.)


(1) makespan = 0
(2) for each node $v_k$ in $m = ((v_1, f_1, p_1), (v_2, f_2, p_2), \ldots, (v_{|V|}, f_{|V|}, p_{|V|}))$ do
(3)     calculate the actual finish time of $v_k$ (i.e., $T_{\mathrm{AFTime}}(v_k)$)
(4)     if makespan < $T_{\mathrm{AFTime}}(v_k)$
(5)         update makespan:
(6)         makespan = $T_{\mathrm{AFTime}}(v_k)$
(7)     end if
(8) end for
(9) return makespan

Algorithm 1: Fit(m): calculating the fitness value of a molecule and the processor allocation optimization.

with a new model for processor selection is adopted as the initialization of the global optimization phase, and it is also utilized as a local search process to promote the intensification capability of DRSCRO. There are four kinds of elementary chemical reactions in DRSCRO: on-wall collision, decomposition, intermolecular collision, and synthesis. Each kind of reaction contains two types of operators, which are respectively utilized in the two phases of DRSCRO. In each iteration, one of the elementary chemical reaction operators is performed to generate new molecules, and the PEs of the newly generated molecules (i.e., their fitness function values) are calculated. In addition, SMole is tracked and participates only in on-wall ineffective collisions and intermolecular ineffective collisions in the global optimization phase, to explore as much of the solution space in its neighborhoods as possible; the main purpose is to prevent the super molecule from changing dramatically. The iteration of each phase repeats until the stopping criteria (or next-phase criteria) are met, and SMole and its fitness function value are then the final solution and its makespan (i.e., the global min point), respectively. In the implementations of the experiments in this paper, the next-phase criteria and the stopping criteria of DRSCRO are set as the condition that there is no makespan improvement after 10000 consecutive iterations in the search loop.
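The per-phase control flow described above, iterating reaction operators until no makespan improvement is seen for a fixed number of consecutive iterations, can be sketched as follows. Here `step` stands in for one elementary-reaction iteration over the population and is a hypothetical placeholder, not DRSCRO's actual operators:

```python
# Hedged sketch of one DRSCRO phase: run reaction steps until the best
# fitness has not improved for `patience` consecutive iterations.
# Phase 1 would return the super molecule SMole; phase 2 would be
# seeded with SMole and the VNS-initialized population.

def run_phase(population, step, fitness, patience=10000):
    best = min(population, key=fitness)
    stall = 0
    while stall < patience:
        population = step(population)
        candidate = min(population, key=fitness)
        if fitness(candidate) < fitness(best):
            best, stall = candidate, 0       # improvement resets the counter
        else:
            stall += 1
    return best, population
```

The stall counter implements the paper's stopping criterion directly; the same routine serves as both the next-phase criterion (phase 1) and the final stopping criterion (phase 2).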

4.2. Molecular Structure and Fitness Function. This subsection presents the encoding of scheduling solutions (i.e., the molecular structure) and the statement of the fitness function in DRSCRO.

4.2.1. Molecular Structure. In this paper, an atom with three elements can be denoted as a tuple (v_i, f_i, p_i), and the molecular structure M, an array of tuples, can be formulated as in (8) to represent a solution to the DAG scheduling problem. The order of the tuples in M represents the priority of each DAG task v_i with its allocated processor p_i, and V = (v_1, v_2, ..., v_|V|) is a topological sequence of the DAG, with the hypothetical entry task (with no predecessors) v_1 and exit task (with no successors) v_|V| respectively representing the beginning and end of execution. Moreover, if tuple A is before tuple B and v_A is the predecessor of v_B in the DAG, the second element of tuple B, f_B, will be 1, and vice versa:

m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)). (8)
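For illustration, the encoding in (8) can be held as an ordered Python list of (task, flag, processor) tuples; the molecule below mirrors the first molecule of the example in Section 4.5, and the helper function name is ours, not the paper's.

```python
# Encoding of a molecule as in (8): tuple order encodes task priority,
# the flag f_i encodes the predecessor relation described in Section 4.2.1,
# and the third element is the allocated processor.
molecule = [("v1", 0, "p1"), ("v2", 1, "p1"), ("v4", 0, "p1"), ("v3", 1, "p1")]

def processor_of(molecule, task):
    """Look up the processor allocated to a task in a molecule."""
    for v, f, p in molecule:
        if v == task:
            return p
    raise KeyError(task)
```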

4.2.2. Fitness Function. Potential energy (PE) is defined as the fitness function value of the solution represented by a molecule m. The overall schedule length of the entire DAG, namely, the makespan, is the largest finish time among all tasks, which is equivalent to the actual finish time of the exit node of the DAG. In this paper, the goal of the DAG scheduling problem solved by DRSCRO is to obtain the schedule that minimizes the makespan while ensuring that the precedence of the tasks is not violated. Hence, each fitness function value is defined as

Fit(m) = PE_m = makespan. (9)

Algorithm 1 presents how to calculate the value of the optimization fitness function Fit(m).

4.3. Super Molecule Selection Phase

4.3.1. Initialization. There are two kinds of initial molecule generator: one used in the super molecule selection phase and the other used in the global optimization phase to generate the initial solutions for DRSCRO to manipulate. The tuples of the first molecule m used in the initialization of the super molecule selection phase are ascendingly ordered by the upward rank value [27] of their v_i, and the third element p_i of each tuple is generated by a random perturbation. The upward rank value can be calculated by

Rank(v_i) = W(v_i) + max_{v_j ∈ successor(v_i)} (comm(v_i, v_j) + Rank(v_j)). (10)
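The recursion in (10) can be transcribed directly. In the sketch below, succ, W, and comm are assumed lookup tables for the successor sets, task costs W(v), and edge communication costs comm(v_i, v_j); the toy DAG used to exercise it is ours, not the graph of Figure 1(a).

```python
def upward_rank(v, succ, W, comm, memo=None):
    """Upward rank of task v as in (10): W(v) plus the largest
    (comm + successor rank) over v's successors; 0 extra for exit tasks."""
    if memo is None:
        memo = {}
    if v not in memo:
        best = 0.0
        for s in succ.get(v, []):
            best = max(best, comm[(v, s)] + upward_rank(s, succ, W, comm, memo))
        memo[v] = W[v] + best
    return memo[v]
```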

A detailed description of the initial molecule generator of the super molecule selection phase is given in Algorithm 2. For the first input molecule m, p_x in each tuple of m is set as p_1.

4.3.2. Elementary Chemical Reaction Operators. In DRSCRO, the operators for super molecule selection just randomly change p_i of each tuple in a molecule, as the intensification searches or the diversification searches [25], to optimize the processor mapping of a solution. Figures 3, 4, 5, and 6, respectively, show examples of the four operators for super

8 Mathematical Problems in Engineering

(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3) for each p_i in molecule m to randomly change
(4) change p_i randomly
(5) end for
(6) generate a new molecule m′
(7) MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m), generating the initial population for the super molecule selection phase.
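Algorithm 2 can be sketched as a loop that clones the seed molecule and redraws each tuple's processor at random. The function and argument names below are illustrative, not from the paper; a seeded generator is used only to keep the sketch reproducible.

```python
import random

def init_mole_sms(seed, processors, pop_size, rng=random.Random(0)):
    """Build PopSize molecules from a seed molecule by redrawing each
    tuple's processor at random (steps (2)-(8) of Algorithm 2)."""
    population = []
    while len(population) < pop_size:
        population.append([(v, f, rng.choice(processors)) for (v, f, _p) in seed])
    return population
```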

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). The white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m′ from a given reaction molecule m for optimization. OnWallSMS works as follows: (1) the operator randomly chooses a tuple (v_i, f_i, p_i) in m; (2) the operator changes p_i randomly. In the end, the operator

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

generates a new molecule m′ from m as an intensification search.
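A hypothetical sketch of OnWallSMS following steps (1)-(2) above; the names and the seeded generator are ours.

```python
import random

def on_wall_sms(molecule, processors, rng=random.Random(1)):
    """OnWallSMS sketch: pick one tuple at random and redraw its processor,
    yielding a new molecule m' (an intensification search)."""
    m_new = list(molecule)
    i = rng.randrange(len(m_new))
    v, f, _p = m_new[i]
    m_new[i] = (v, f, rng.choice(processors))
    return m_new
```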

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m1′ and m2′ from a given reaction molecule m. DecompSMS works as follows: (1) the operator generates two molecules, m1′ = m and m2′ = m; (2) the operator keeps the tuples in m1′ which are at the odd positions in m and then changes the remaining p_x's of the tuples in m1′ randomly; (3) the operator retains the tuples in m2′ which are at the even positions in m and then changes the remaining p_x's of the tuples in m2′ randomly. In the end, the operator generates two new molecules, m1′ and m2′, from m as a diversification search.
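The odd/even split of DecompSMS can be sketched as below; the names are illustrative, and a seeded generator keeps the sketch reproducible.

```python
import random

def decomp_sms(molecule, processors, rng=random.Random(2)):
    """DecompSMS sketch: clone m twice; the first clone keeps the tuples at
    odd positions (index 0, 2, ...) and redraws the rest, the second clone
    keeps the tuples at even positions (index 1, 3, ...)."""
    def perturb(keep_odd):
        out = []
        for idx, (v, f, p) in enumerate(molecule):
            keep = (idx % 2 == 0) if keep_odd else (idx % 2 == 1)
            out.append((v, f, p if keep else rng.choice(processors)))
        return out
    return perturb(True), perturb(False)
```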


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4) for each tempS in tempSet except SMole
(5) choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0 randomly
(6) generate a random number rnd ∈ (0, 1)
(7) if rnd ≥ 0.5
(8) find the first predecessor v_j = Pred(v_i), searching from v_i toward the beginning of molecule tempS
(9) interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10) update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11) end if
(12) for each p_i in molecule tempS to randomly change
(13) change p_i randomly
(14) end for
(15) if Fit(tempS) < Fit(SMole)
(16) SMole = tempS
(17) end if
(18) end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23) pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num), initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m1′ and m2′ from given molecules m1 and m2. This operator first uses the steps in OnWallSMS to generate m1′ from m1, and then the operator generates the other new molecule, m2′, from m2 in a similar fashion. In the end, the operator generates two new molecules, m1′ and m2′, from m1 and m2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m′ from given molecules m1 and m2. SynthSMS works as follows: the operator keeps the tuples in m′ which are at the same positions in m1 and m2 with the same p_x's, and then changes the remaining p_y's in m′ randomly. As a result, the operator generates m′ from m1 and m2 as a diversification search.
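SynthSMS can be sketched as a position-wise merge: processor assignments on which the two parents agree are kept, and the rest are redrawn. Names and the seeded generator are illustrative.

```python
import random

def synth_sms(m1, m2, processors, rng=random.Random(3)):
    """SynthSMS sketch: keep a tuple's processor where m1 and m2 agree at
    the same position; otherwise draw a fresh processor (diversification)."""
    out = []
    for (v, f, p1), (_v2, _f2, p2) in zip(m1, m2):
        out.append((v, f, p1 if p1 == p2 else rng.choice(processors)))
    return out
```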

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4, respectively, present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population containing the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of elements in pop_subset, which is set as PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in the VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of optimizing both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3. In each iteration, d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which have demonstrated their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing the load among the various processors usually helps minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing the communication overhead and the idle waiting time of processors always results in a more effective schedule, especially given a relatively high unit communication cost. However, there is a contradiction between these two intuitions, because reducing the communication overhead and the idle waiting time of processors always means that some processors are assigned most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4) d = 1
(5) while d < d_max do
(6) randomly generate a molecule m_1 from the dth neighborhood of m
(7) apply some local search method with m_1 as the initial molecule (the local optimum is represented by m_2)
(8) if m_2 is better than m
(9) m = m_2
(10) d = 1
(11) else
(12) d = d + 1
(13) end if
(14) end while
(15) until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set), generating the initial population of the global optimization phase.

(1) for each processor p_i in the solution ω do
(2) compute Tend(p_i)
(3) end for
(4) choose the processor p_max with the largest Tend(p_max)
(5) randomly choose a task v_random from exec(p_max)
(6) randomly choose a processor p_random different from p_max
(7) reallocate v_random to the processor p_random
(8) encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).
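A hypothetical sketch of Algorithm 5, assuming the Tend value of each processor (as defined in (11)) has already been computed. The dict-based solution encoding is ours, and the final encode/reschedule step is omitted.

```python
import random

def load_balance_neighbor(assignment, tend_of, processors, rng=random.Random(4)):
    """Algorithm 5 sketch: move a random task off the processor with the
    largest Tend value to a different, randomly chosen processor.
    assignment: dict task -> processor; tend_of: dict processor -> Tend."""
    p_max = max(processors, key=lambda p: tend_of[p])
    task = rng.choice([t for t, p in assignment.items() if p == p_max])
    p_new = rng.choice([p for p in processors if p != p_max])
    out = dict(assignment)
    out[task] = p_new
    return out
```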

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) are the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of task reducing or increasing): the greater TCload(p_i) is, the stronger the tendency of reducing tasks on p_i is, and the greater TCcomm(p_i) is, the stronger the tendency of increasing tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure the tendency by combining TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model takes both intuitions into comprehensive consideration and can make the VNS algorithm more effective than the original one:

Tend(p_i) = 1 / (1 + e^(−TCload(p_i)/TCcomm(p_i))). (11)
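Equation (11) is a logistic function of the ratio TCload(p_i)/TCcomm(p_i); a direct transcription (the function name is ours):

```python
import math

def tend(tc_load, tc_comm):
    """Processor-selection tendency as in (11): a logistic blend of the
    load tendency TCload(p_i) and communication tendency TCcomm(p_i)."""
    return 1.0 / (1.0 + math.exp(-(tc_load / tc_comm)))
```

A larger load relative to communication overhead pushes Tend toward 1 (shed tasks, Algorithm 5 picks the largest Tend); a larger communication overhead pushes it toward 0 (gain tasks, Algorithm 6 picks the smallest Tend).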

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum iteration number without improvement to 3. The VNS algorithm stops if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.
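The combination strategy described above amounts to merge, sort by makespan, and truncate; a minimal sketch with illustrative names:

```python
def combine(population, vns_output, fitness, pop_size):
    """Merge the current population with the VNS output, sort by ascending
    makespan (the fitness), and keep the best pop_size molecules."""
    merged = population + vns_output
    merged.sort(key=fitness)
    return merged[:pop_size]
```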

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary p_i of each tuple but also interchange the positions of the tuples in a molecule, as the intensification searches or the diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper, and it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2) compute Tend(p_i)
(3) end for
(4) choose the processor p_min with the smallest Tend(p_min)
(5) set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7) compute the set predecessor(v_i)
(8) update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9) cand = cand + predecessor(v_i)
(10) end for
(11) randomly choose a task v_random from cand
(12) reallocate v_random to the processor p_min
(13) encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).

(1) choose randomly a tuple (v_i, f_i, p_i) in m1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m1′ = GenLBNeighborhood(m1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m2′ = GenLBNeighborhood(m2)

Algorithm 7: IntermoleGO(m1, m2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks | p1 | p2 | p3
v1 | 7 | 8 | 9
v2 | 12 | 14 | 16
v3 | 14 | 15 | 16
v4 | 13 | 3 | 14

before, to promote the intensification capability of DRSCRO and avoid the function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m1′ and m2′ from given molecules m1 and m2. This operator first uses the steps in OnWallGO to generate m1′ from m1, and then the operator generates the other new molecule, m2′, from m2 in a similar fashion. In the end, the operator generates two new molecules, m1′ and m2′, from m1 and m2 as an intensification search. The detailed executions are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v_1, v_2, v_4, v_3) is found based on the upward rank value of each task in the DAG, and the first molecule, m = ((v_1, 0, p_1), (v_2, 1, p_1), (v_4, 0, p_1), (v_3, 1, p_1)), can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations of the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v_1, 0, p_3), (v_2, 1, p_1), (v_4, 0, p_2), (v_3, 1, p_1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations of the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v_1, 0, p_2), (v_4, 1, p_2), (v_2, 0, p_2), (v_3, 1, p_2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability for solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is obtained by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT, applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better capability of intensification search than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also serves as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is utilized in the ineffective reaction operator as well. Moreover, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures to promote the efficiency of VNS, different from the VNS proposed in [22]. All three advantages, as previously mentioned, enhance the ability to obtain better rapidity of convergence and better search results in the whole solution space, which is demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], by theoretical analysis and experimental results, TMSCRO and DMSCRO proved to have better performance than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and for DRSCRO, as a metaheuristic algorithm, we focus on the performance of our proposed algorithm itself and the comparison between DRSCRO and other similar kinds of algorithms.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: the molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; it systematically applies row operations to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest task number at the same level is 6. The reason for utilizing these two application graphs as a test bed is not only to enhance the comparability of the various algorithms but also to demonstrate the practical application of our proposed algorithm, without loss of generality. The second extensive test bed for the comparative study is a set of random graph DAGs. A random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as the CCR, the amount of calculation of a task, the successor number of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and communication links have the same computation cost and communication cost, respectively.

As shown in Figure 2, the next-phase criteria and stopping criteria of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.


Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the speeds of a computing processor being different for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set to the value that makes the heterogeneity level 2, unless otherwise specified in this paper.
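As a one-line check of the heterogeneity-level formula (the function name is ours): with hl = 1/3, the level is exactly 2, consistent with the value hl = 0.333 in Table 4.

```python
def heterogeneity_level(hl):
    """Heterogeneity level (1 + hl) / (1 - hl): the biggest possible ratio
    of the best to the worst processor speed for a task."""
    return (1.0 + hl) / (1.0 - hl)
```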

The details of the parameter setting are shown in Table 4. Parameters 6-12, which concern the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiments.

Parameter | Value
CCR | 0.1, 0.2, 1, 2, 5
Number of processors | 4, 8, 16, 32
hl | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph | 10, 20, 50
InitialKE | 1000
θ | 500
ϑ | 10
Buffer | 200
KELossRate | 0.2
MoleColl | 0.2
PopSize | 10

bed, the makespan is averaged over 30 different random-graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10-13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5-8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance in searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; therefore, this suggests that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better capability of intensification search, achieved by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and the utilization of one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the performance of the average results obtained by DRSCRO is better than that obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT, because as metaheuristic methods they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of the heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the makespan on average increases with the CCR value, because the heterogeneous processors stay idle for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance across a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8 | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
16 | 82.356 | 80.668 | 80.235 | 80.074 | 78.433 | 82.015 | 1.706
32 | 75.630 | 74.080 | 73.682 | 73.535 | 72.027 | 75.317 | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2 | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5 | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 96.612 | 94.632 | 94.124 | 93.935 | 92.010 | 96.212 | 2.001
0.2 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
1 | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2 | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5 | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range, and that the metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14-16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of the experimental results are listed in Tables 9-11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases: the average makespan rises rapidly with increasing CCR, because the DAG becomes more communication-intensive, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8 | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Number of tasks | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
10 | 714.484 | 706.149 | 699.100 | 690.708 | 688.982 | 693.638 | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for random graphs under different CCRs (the task number is 50).

Value of CCR   |P| = 4   |P| = 8   |P| = 16   |P| = 32
0.1            145.093   116.068   116.058    114.592
0.2            145.261   116.200   116.163    114.700
1              153.248   120.228   119.511    118.534
2              194.401   166.014   164.246    162.292
5              508.759   498.427   492.049    487.830

Figure 14: Average makespan for random graphs under different processor numbers (task number is 50 and CCR = 0.2); curves for HEFT, DMSCRO, TMSCRO, and DRSCRO at |P| = 4, 8, 16, and 32.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion for all three algorithms is that the total running time reaches a preset value (e.g., 180 s). For comparability, the time counting of DRSCRO starts at the beginning of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.

Figure 15: Average makespan for random graphs under different task numbers (processor number is 32 and CCR = 1.0); curves for HEFT, DMSCRO, TMSCRO, and DRSCRO.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets containing 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms differ markedly, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as

Figure 16: Average makespan of DRSCRO for random graphs under different CCRs (the task number is 50); curves for P = 4, 8, 16, and 32.

Figure 17: Convergence trace for the molecular dynamics code (CCR = 1 and the number of processors is 16).

Figure 18: Convergence trace for Gaussian elimination (CCR = 0.2 and the number of processors is 8).

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO still gives it a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, the statistical analysis based on the average values achieved is presented in Section 5.4.2 to prove

Table 12: Results of Friedman tests (α = 0.05).

Method          Algorithms                  p value     Hypothesis
Friedman test   DRSCRO, DMSCRO              2.53E-02    Rejected
Friedman test   DRSCRO, TMSCRO              2.53E-02    Rejected
Friedman test   DRSCRO, DMSCRO, TMSCRO      6.70E-03    Rejected

Table 13: Results of Quade tests (α = 0.05).

Method       Algorithms                  p value     Hypothesis
Quade test   DRSCRO, DMSCRO              1.32E-02    Rejected
Quade test   DRSCRO, TMSCRO              1.32E-02    Rejected
Quade test   DRSCRO, DMSCRO, TMSCRO      1.09E-03    Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level of α = 0.05 is used in all statistical tests.

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively; both reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is not only compared against all the algorithms jointly but also compared pairwise against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.
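The Friedman statistic behind results like those in Table 12 can be computed directly from the per-instance rankings of the algorithms. The following minimal pure-Python sketch (the makespan values are illustrative placeholders, not the paper's data) ranks the algorithms within each problem instance and compares the statistic against the chi-square critical value for α = 0.05:

```python
def friedman_statistic(results):
    """results: one row per problem instance, each row holding the average
    makespans of the k algorithms on that instance. Returns the Friedman
    chi-square statistic (no tie correction)."""
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for block in results:
        order = sorted(range(k), key=lambda j: block[j])  # rank 1 = smallest makespan
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Hypothetical (DRSCRO, TMSCRO, DMSCRO) average makespans on 6 instances:
data = [(145.2, 145.6, 146.3), (116.2, 116.4, 117.1), (116.1, 116.4, 117.0),
        (114.7, 114.9, 115.6), (690.7, 699.1, 706.1), (1634.2, 1650.0, 1662.5)]
stat = friedman_statistic(data)
# For k = 3 algorithms, df = k - 1 = 2, and the 0.05 critical value is 5.991;
# a statistic above it rejects the hypothesis of equivalent performance.
print(stat, stat > 5.991)
```

With one algorithm dominating every instance, as in this toy data, the statistic is maximal for the given n and k, mirroring the "Rejected" entries in the tables.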

In sum, it can be concluded that DRSCRO, the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms on convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of other metaheuristic algorithms of the same kind, because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance for searching optimal solutions when averaged over all possible fitness functions. However, as the convergence tests show, the proposed DRSCRO achieves better performance and finds good solutions faster than similar metaheuristic algorithms. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO obtains a better super molecule by a metaheuristic method and, under the consideration of optimizing both scheduling order and processor assignment, takes advantage of

the VNS algorithm in the global optimization phase to improve the optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability. The new processor selection model utilized in the neighborhood structures also promotes the efficiency of the VNS algorithm.
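As a rough illustration of the VNS scheme referred to above, the following generic skeleton (the neighborhood functions and fitness are illustrative stand-ins, not the paper's scheduling operators or its load balance neighborhood) systematically switches neighborhoods, restarting from the first neighborhood whenever the local search improves the incumbent:

```python
import random

def vns(initial, neighborhoods, fitness, max_iters=200, seed=1):
    """Basic VNS: shake in neighborhood k, run a crude local search, and restart
    from the first neighborhood on improvement (otherwise try the next one)."""
    rng = random.Random(seed)
    best = initial
    for _ in range(max_iters):
        k = 0
        while k < len(neighborhoods):
            candidate = neighborhoods[k](best, rng)            # shaking
            # local search sketch: best of a few moves in the first neighborhood
            improved = min((neighborhoods[0](candidate, rng) for _ in range(10)),
                           key=fitness)
            if fitness(improved) < fitness(best):
                best, k = improved, 0                          # improvement: restart
            else:
                k += 1                                         # no luck: next neighborhood
    return best

# Toy usage: minimize (x - 3)^2 with a small-step and a large-step neighborhood.
f = lambda x: (x - 3.0) ** 2
small = lambda x, rng: x + rng.uniform(-0.1, 0.1)
large = lambda x, rng: x + rng.uniform(-1.0, 1.0)
result = vns(0.0, [small, large], f)
print(round(result, 3))
```

The systematic neighborhood change is what distinguishes VNS from a plain local search: when the small neighborhood stalls, the larger one provides the perturbation needed to escape the current basin.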

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Different from other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and DRSCRO can also obtain better average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will be extended to address two main objectives: (1) minimization of schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and

the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "Genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.

for DAG scheduling on heterogeneous systems so far, according to our knowledge. These two algorithms both focus on DAG scheduling with the objective of minimizing the makespan. However, as metaheuristic scheduling strategies, CRO-based algorithms for DAG scheduling still have very high time cost, and their convergence rates also need to be improved. In [30], the concept of the super molecule is applied for accelerating convergence, and the super molecule is selected by heuristic scheduling strategies. However, the performance of this kind of super molecule selection method is affected by the range of problems.

This paper proposes an algorithm, Double-Reaction-Structured CRO (DRSCRO), for DAG task scheduling on heterogeneous systems, aiming to obtain schedules with better quality. In this paper, the conventional CRO framework scheme is modified, and two reaction phases, one for super molecule selection and another for global optimization, are developed in DRSCRO. CRO, as a metaheuristic algorithm, is utilized in the molecule selection phase to obtain a super molecule [31] for a better convergence rate. The variable neighborhood search (VNS) algorithm [32], with a new processor selection model as well as its neighborhood structure, is also utilized to promote the intensification capability in the global optimization phase.

There are three major contributions of this work:

(1) Developing DRSCRO by modifying the conventional CRO framework and utilizing a metaheuristic method to obtain a super molecule for accelerating convergence.

(2) Utilizing the VNS [32] algorithm, with a new processor selection model, as the initialization of the global optimization phase, which takes into account the optimization of both the scheduling order and the processor assignment, and applying one of its neighborhood structures in the reaction operator to promote the intensification capability of DRSCRO.

(3) Conducting simulation experiments to prove the efficiency and effectiveness of DRSCRO in terms of makespan and convergence rate.

The next section introduces relevant research on the DAG scheduling problem on heterogeneous systems. Section 3 describes the models of the studied problem as a formal statement. Section 4 presents the design of the proposed DRSCRO for DAG scheduling. In Section 5, the simulation performance of DRSCRO is analyzed and compared with some existing algorithms. Section 6 draws the conclusions of this paper and gives suggestions for future research.

2. Literature Review

The DAG scheduling problem, which has been proven to be NP-hard in general [1], can be formulated as the search for an optimal solution to the assignment of the tasks in a DAG onto a set of processors so as to minimize the total scheduling length (i.e., makespan). The various scheduling algorithms proposed over the last decade fall into two main categories: heuristic (deterministic) and metaheuristic (nondeterministic). As metaheuristic methods, CRO-based algorithms for DAG scheduling on heterogeneous systems are based on the Chemical Reaction Optimization (CRO) algorithm, which was proposed very recently and has shown its power to deal with NP-complete problems.

2.1. Heuristic and Metaheuristic Methods. The heuristic methods are based on heuristics extracted from intuitions, and the most important class of them is list scheduling algorithms [2–12]. The HEFT algorithm, proposed by Topcuoglu et al. [3], utilizes the average execution cost of each task as an upward-ranking heuristic to calculate the task priority. At each step of HEFT, the task with the highest upward rank is selected and mapped to a processor with a greedy approach (i.e., the assigned processor minimizes the earliest finish time of the selected task). Experimental results show that HEFT obtains better schedule quality at lower computational cost than the other list scheduling algorithms. The performance of heuristic-based algorithms relies heavily on the effectiveness of the heuristics: the higher the complexity of the DAG scheduling problem, the harder it is for greedy heuristics to produce consistent results on a wide range of problems. Different from heuristic-based algorithms, the metaheuristic methods use a guided-random-search-based process for solution searching. They typically require sufficient sampling of candidate solutions in the search space and have shown robust performance on a variety of scheduling problems. In particular, GA has been widely used to evolve solutions for many task scheduling problems as the most representative metaheuristic method [21]. Many metaheuristic algorithms have been utilized to solve the DAG scheduling problem successfully, such as GA [14–21], ACO [13], SA [24], TS [22, 23], CRO [27, 30], VNS [21], and energy-efficient stochastic scheduling [33].
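The upward-rank priority used by HEFT can be sketched as follows: the rank of a task is its average execution cost plus the maximum, over its successors, of the communication cost plus the successor's rank. The task names, costs, and graph below are illustrative placeholders, not data from the paper:

```python
from functools import lru_cache

avg_cost = {"A": 10, "B": 12, "C": 8, "D": 6}          # mean cost over processors
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
comm = {("A", "B"): 4, ("A", "C"): 3, ("B", "D"): 2, ("C", "D"): 5}

@lru_cache(maxsize=None)
def rank_u(v):
    """Upward rank, computed recursively from the exit task."""
    if not succ[v]:
        return avg_cost[v]                              # exit task
    return avg_cost[v] + max(comm[(v, s)] + rank_u(s) for s in succ[v])

# Scheduling list: tasks in decreasing upward rank (a valid topological order).
order = sorted(avg_cost, key=rank_u, reverse=True)
print(order)   # → ['A', 'B', 'C', 'D']
```

Because a task's rank always exceeds that of each successor, sorting by decreasing rank never violates the precedence constraints.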

According to the No-Free-Lunch Theorem [34], all well-designed metaheuristic methods have the same performance on searching for optimal solutions when averaged over all possible fitness functions. In comparison with the heuristic methods, the metaheuristic methods, which always have much higher computational cost, can obtain better schedule quality, because the metaheuristic methods can search a wider area of the solution space with guided-random-search-based processes, while the search of heuristic-based algorithms is narrowed down to a much smaller portion by means of the heuristics.

2.2. CRO-Based Algorithms for DAG Scheduling on Heterogeneous Systems. CRO was proposed by Lam and Li very recently [25], and as far as we know, Double Molecular Structure-Based Chemical Reaction Optimization (DMSCRO) [27] and Tuple-Based Chemical Reaction Optimization (TMSCRO) [30] are the only two CRO-based algorithms for DAG scheduling on heterogeneous systems. CRO-based algorithms mimic the chemical reaction process, which accords with energy conservation, in a closed container. The molecules, carrying two kinds of energy, potential energy (PE) and kinetic energy (KE), in CRO-based

Table 1: Parameters used in CRO.

Parameter    Definition
PE           Current potential energy of a molecule
KE           Current kinetic energy of a molecule
InitialKE    Initial kinetic energy of a molecule
θ            Threshold value guiding the choice of on-wall collision or decomposition
ϑ            Threshold value guiding the choice of intermolecular collision or synthesis
Buffer       Initial energy in the central energy buffer
KELossRate   Loss rate of kinetic energy
MoleColl     Threshold value determining whether to perform a unimolecular or an intermolecular reaction
PopSize      Number of molecules (population size)
iters        Number of iterations

algorithms are the solutions to the DAG scheduling problem. The PE value of a molecule is calculated by the fitness function, which is equal to the objective value, the makespan, of the corresponding solution. KE helps the molecule escape from local optimums, and its value is nonnegative. A buffer is also used in CRO-based algorithms for energy interchange and conservation. Moreover, to find the solution with the globally minimal makespan, four types of elementary chemical reactions, on-wall ineffective collision, decomposition, intermolecular ineffective collision, and synthesis, are applied for the intensification and diversification searches. The typical execution flow of the CRO framework adopted in DMSCRO and TMSCRO is as proposed in [25], and the parameters used in CRO are presented in Table 1.
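The reaction-dispatch logic of that execution flow can be sketched as below. This is an illustrative pseudo-implementation of the framework in [25], not the authors' code; the parameter names follow Table 1, while the trigger conditions (a hit counter against θ, kinetic energies against ϑ) are simplified assumptions:

```python
import random

def cro_iteration(population, params, rng):
    """population: list of molecules, each a dict with 'KE' and 'hits' fields.
    Returns the name of the elementary reaction that would fire this iteration."""
    if rng.random() > params["MoleColl"] or len(population) == 1:
        m = rng.choice(population)                   # unimolecular branch
        # decomposition fires when a molecule is stuck (hits exceed theta)
        return "decomposition" if m["hits"] > params["theta"] else "on-wall"
    m1, m2 = rng.sample(population, 2)               # intermolecular branch
    # synthesis fires when both molecules have little KE left (below vartheta)
    if m1["KE"] < params["vartheta"] and m2["KE"] < params["vartheta"]:
        return "synthesis"
    return "intermolecular"

params = {"MoleColl": 0.5, "theta": 500, "vartheta": 10.0}
pop = [{"KE": 100.0, "hits": 0}, {"KE": 5.0, "hits": 600}]
rng = random.Random(0)
outcomes = {cro_iteration(pop, params, rng) for _ in range(200)}
print(sorted(outcomes))
```

The two ineffective collisions perform the intensification search, while decomposition and synthesis provide the diversification that lets molecules jump out of local optima.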

As metaheuristic methods, DMSCRO and TMSCRO achieve better schedule quality than heuristic methods, for the reason presented in the last paragraph of Section 2.1. The experimental results in [27, 30] prove that both DMSCRO and TMSCRO outperform GA. DMSCRO is the first algorithm to apply the CRO framework proposed by Lam and Li in [25] to the DAG scheduling problem, and it enjoys the advantages of both GA and SA. On the one hand, the intermolecular collision and on-wall collision designed in DMSCRO have effects similar to the crossover and mutation operations in GA, respectively. On the other hand, the energy conservation requirement in DMSCRO is able to guide the search for the optimal solution, similarly to the way the Metropolis algorithm of SA guides the evolution of solutions in SA. Two additional operations, decomposition and synthesis, give DMSCRO more opportunities to jump out of local optima and explore wider areas of the solution space. This benefit enables DMSCRO to find good solutions faster than GA, which has been widely used to evolve solutions for many task scheduling problems. DMSCRO is not compared with SA in [27, 30] because the underlying principles and philosophies of DMSCRO and SA differ a lot [27]. Typically, metaheuristic algorithms that operate on a population of solutions, such as CRO-based or GA-based algorithms, are able to find good solutions faster than those operating on a single solution, such as SA-based algorithms. Compared with DMSCRO, TMSCRO applies the constrained earliest finish time algorithm for data pretreatment to take advantage of the super molecule and constrained critical paths [35] as heuristic information for accelerating convergence. Moreover, the molecule structure and elementary reaction operator designs in TMSCRO are more reasonable than those in DMSCRO regarding the intensification and diversification of the search of the solution space.

However, for solving the NP-hard problem of DAG scheduling on heterogeneous systems, the CRO-based algorithms TMSCRO and DMSCRO still have very large time expenditure as metaheuristic scheduling strategies; therefore, their searching capabilities and convergence rates need to be improved. There are three deficiencies of TMSCRO and DMSCRO. First, in [30], the concept of the super molecule is applied for accelerating convergence, and the super molecule is selected by heuristic scheduling strategies, but the performance of this kind of super molecule selection method is affected by the range of problems. Second, in both TMSCRO and DMSCRO, the initial molecules, which are very important for the whole searching process, are randomly created, and the uncertainty of this kind of initialization undermines their searching capabilities. Moreover, the intensification capabilities of CRO-based algorithms for DAG scheduling also need to be improved to obtain better average results when the iteration stopping criteria are satisfied.

Therefore, this paper proposes the algorithm Double-Reaction-Structured CRO (DRSCRO) for DAG task scheduling on heterogeneous systems, aiming to obtain schedules with better quality. The conventional CRO framework scheme is modified, and two reaction phases, one for super molecule selection and another for global optimization, are developed in DRSCRO. CRO, as a metaheuristic algorithm, is utilized in the molecule selection phase to obtain a super molecule [31] for a better convergence rate. Moreover, in the global optimization phase, the variable neighborhood search (VNS) algorithm [21, 32, 36], an effective metaheuristic that uses neighborhood structures and a local search to change the neighborhood systematically, is used to optimize the initial molecule, and one of its neighborhood structures is also adopted in the reaction operator to promote the

intensification capability. In addition, a new model for processor selection is proposed and utilized in the neighborhood structures of the VNS algorithm for better effectiveness.

Moreover, in [21], VNS was incorporated with GA for DAG scheduling, but the task priority was unchangeable in the VNS algorithm of [21], which reduces the efficiency of VNS in obtaining a better solution. Therefore, different from [21], to promote the intensification capability of the whole algorithm, the VNS in DRSCRO is modified under the consideration of optimizing both the scheduling order and the processor assignment.

3. Problem Formulation

The DAG scheduling problem typically has two inputs: a heterogeneous system for computing tasks in parallel and a parallel application program (i.e., a DAG). In this paper, the heterogeneous system is assumed to be a static computing system model presented by P = {p_i | i = 1, 2, 3, ..., |P|}, which is a fully connected network of processors. The heterogeneity level in this paper is formulated as (1 + hl)/(1 − hl), where the parameter hl ∈ (0, 1). In this paper, EcCost_{p_j}(v_i) represents the computation cost of a task v_i mapped to the processor p_j, and the value of each EcCost_{p_j}(v_i) is randomly chosen within the scope [1 − hl, 1 + hl].

In general, DAG = (V, E) consists of a task (node) set V and an edge set E. EcCost_{p_j}(v_i) is as defined in the first paragraph of this section, and a processor executes a task in the DAG without preemption. The constraint between tasks v_i and v_j is denoted by the edge e_{ij} (e_{ij} ∈ E), which means that task v_j can be executed only after the execution result of task v_i has been transmitted to it. Each edge e_{ij} has a nonnegative weight comm(v_i, v_j) denoting the communication cost between v_i and v_j. Each task in a DAG can be executed on only one processor, and communication can be performed simultaneously by the processors. In addition, when two communicating tasks are mapped to the same processor, their communication cost is zero. predecessor(v_i) represents the set of the predecessors of v_i, while successor(v_i) represents the set of the successors of v_i. The task with no predecessor is denoted by v_entry, and the task with no successor is denoted by v_exit.

Consider a DAG with |V| tasks to be mapped to a heterogeneous system with |P| processors. Assuming the highest-priority ready task v_i is on the processor p_j, the earliest start time of v_i, T_ESTime(v_i, p_j), can be formulated as

T_ESTime(v_i, p_j) = max{T_avail(p_j), T_ready(v_i, p_j)},  (1)

where T_avail(p_j) is defined as in (2); T_avail(p_j) is the time when processor p_j becomes available for the execution of the task v_i:

T_avail(p_j) = max_{v_k ∈ exec(p_j)} T_AFTime(v_k),  (2)

where exec(p_j) represents all the tasks which have already been scheduled on the processor p_j, and T_AFTime(v_k) denotes the actual finish time when the task v_k finishes its execution.

T_ready(v_i, p_j) in (1) represents the time when all the data needed for the execution of v_i have been transmitted to p_j, which is formulated as

T_ready(v_i, p_j) = max_{v_k ∈ predecessor(v_i)} {T_AFTime(v_k) + comm(v_k, v_i)},  (3)

where T_AFTime(v_k) has the same definition as in (2) and predecessor(v_i) denotes the set of all the immediate predecessors of task v_i. comm(v_k, v_i) is 0 if the tasks v_k and v_i are mapped to the same processor p_j.

If task v_i is mapped to the processor p_j with the nonpreemptive processing approach, the earliest finish time of task v_i, T_EFTime(v_i, p_j), is formulated as

T_EFTime(v_i, p_j) = T_ESTime(v_i, p_j) + EcCost_{p_j}(v_i).  (4)

After the task v_i is executed by the processor p_j, T_EFTime(v_i, p_j) is assigned to T_AFTime(v_i). The makespan of the entire parallel program is equivalent to the actual finish time of the exit task v_exit:

makespan = max_{v_i ∈ V} T_AFTime(v_i) = T_AFTime(v_exit).  (5)
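As an illustrative sketch, equations (1)-(5) can be exercised with a small insertion-scheduling routine. The execution costs below are those of Table 3 (used in the illustrative example of Section 4.5); the edge set and communication weights in COMM are assumed values for demonstration only, since the exact labels of Figure 1(a) are not listed in the text.

```python
# Illustrative sketch of equations (1)-(5): scheduling a fixed task order onto
# assigned processors. Execution costs follow Table 3; the edge set and the
# communication weights in COMM are assumptions for demonstration only.

EC = {  # EcCost_pj(vi): row = task, column key = processor
    1: {1: 7, 2: 8, 3: 9},
    2: {1: 12, 2: 14, 3: 16},
    3: {1: 14, 2: 15, 3: 16},
    4: {1: 13, 2: 3, 3: 14},
}
COMM = {(1, 2): 17, (1, 4): 14, (2, 3): 16, (4, 3): 5}  # assumed comm(vk, vi)

def makespan(schedule):
    """schedule: list of (task, processor) pairs in priority order."""
    aft = {}        # T_AFTime(vk): actual finish times
    avail = {}      # T_avail(pj): processor release times, equation (2)
    proc_of = {}
    for v, p in schedule:
        # Equation (3): all needed data must have arrived at pj
        # (communication cost is zero on the same processor).
        t_ready = max((aft[k] + (0 if proc_of[k] == p else c)
                       for (k, i), c in COMM.items() if i == v), default=0)
        t_est = max(avail.get(p, 0), t_ready)   # equation (1)
        aft[v] = t_est + EC[v][p]               # equation (4)
        avail[p] = aft[v]
        proc_of[v] = p
    return max(aft.values())                    # equation (5)

# The solution reported in Section 4.5: all tasks on p2 in order (v1, v4, v2, v3).
print(makespan([(1, 2), (4, 2), (2, 2), (3, 2)]))  # -> 40
```

With every task on one processor all communication costs vanish, so the makespan is simply the serialized execution time on p2 (8 + 3 + 14 + 15 = 40), matching the value reported in Section 4.5.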

The communication-to-computation ratio (CCR) can be computed as

CCR = (Σ_{e_{ij} ∈ E} comm(v_i, v_j)) / (Σ_{v_i ∈ V} W(v_i)),  (6)

where W(v_i) is the average computation cost of task v_i, which can be calculated as follows:

W(v_i) = (Σ_{k=1}^{|P|} EcCost_{p_k}(v_i)) / |P|.  (7)
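Equations (6) and (7) can be sketched directly on the Table 3 costs; the communication weights in COMM below are assumed illustrative values, not the actual edge labels of Figure 1(a).

```python
# Sketch of (6) and (7): average computation costs and the CCR of a DAG.
EC = {1: (7, 8, 9), 2: (12, 14, 16), 3: (14, 15, 16), 4: (13, 3, 14)}
COMM = {(1, 2): 17, (1, 4): 14, (2, 3): 16, (4, 3): 5}   # assumed comm(vi, vj)

W = {v: sum(costs) / len(costs) for v, costs in EC.items()}   # equation (7)
ccr = sum(COMM.values()) / sum(W.values())                    # equation (6)
print(W)             # -> {1: 8.0, 2: 14.0, 3: 15.0, 4: 10.0}
print(round(ccr, 3))  # -> 1.106
```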

A simple four-task DAG and a heterogeneous computation system with three processors are shown in Figures 1(a) and 1(b), respectively. The definitions of the notations can be found in Table 2.

4. Design of DRSCRO

DRSCRO imitates the molecular interactions in chemical reactions, based on the concepts of atoms, molecules, molecular structure, and the energy of a molecule. In DRSCRO, a molecule corresponds to a scheduling solution in DAG scheduling, with a unique molecular structure representing the atom positions in a molecule. We utilize the molecular structure of TMSCRO in our work, in consideration of its capability to represent the constrained relationships between the tasks in a molecule (solution). In addition, the energy of each molecule corresponds to the fitness value of a solution. The molecular interactions try to reconstruct more stable molecular structures with lower energy. There are four kinds of basic chemical reactions, on-wall ineffective


Table 2: Definitions of notations.

DAG = (V, E): Input directed acyclic graph with |V| nodes representing tasks and |E| edges representing constrained relations among the tasks
P = {p_i | i = 1, 2, 3, ..., |P|}: Set of heterogeneous processors in the target system
EcCost_{p_j}(v_i): Execution cost of task v_i using processor p_j
comm(v_k, v_i): Communication cost from task v_k to task v_i
T_ESTime(v_i, p_j): Earliest start time of task v_i when mapped to processor p_j
T_EFTime(v_i, p_j): Earliest finish time of task v_i when mapped to processor p_j
T_avail(p_j): Time when processor p_j is available
T_ready(v_i, p_j): Time when all the data needed for the execution of v_i have been transmitted to p_j
T_AFTime(v_k): Actual finish time when task v_k finishes its execution
predecessor(v_i): Set of the predecessors of task v_i
successor(v_i): Set of the successors of task v_i
exec(p_j): Set of the tasks which have already been scheduled on the processor p_j
W(v): Average computation cost of task v
CCR: Communication-to-computation ratio
hl: Parameter for adjusting the heterogeneity level of a heterogeneous system

Figure 1: (a) DAG model. (b) Heterogeneous computation system model.

collision, decomposition, intermolecular ineffective collision, and synthesis, for molecular interactions in DRSCRO, and each kind of reaction contains two subclasses. These two subclasses of reaction operators are applied in the phase of super molecule selection and the phase of global optimization, respectively.

4.1. Framework of DRSCRO. The framework of DRSCRO for scheduling a DAG job is shown in Figure 2, with two basic phases: the phase of super molecule selection and the phase of global optimization. In each phase, DRSCRO first initializes the process of the phase, and then the phase process enters its iteration.

In this framework, DRSCRO first executes the phase of super molecule selection to obtain the super molecule (i.e., the molecule with the global minimal makespan), SMole, with the other output molecules as the input of the following global optimization phase for the first time (afterwards, the input of the VNS algorithm is the population with SMole after each iteration in the global optimization phase), and then DRSCRO performs the phase of global optimization to approach the global optimum solution. The VNS algorithm


Figure 2: Framework of DRSCRO (two reaction loops: the super molecule selection phase with operators OnWallSMS, DecompSMS, IntermoleSMS, and SynthSMS, followed by the global optimization phase, initialized by InitMoleGOVNS, with operators OnWallGO, DecompGO, IntermoleGO, and SynthGO).


(1) makespan = 0
(2) for each node v_k in m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)) do
(3)   calculate the actual finish time of v_k (i.e., T_AFTime(v_k))
(4)   if makespan < T_AFTime(v_k)
(5)     update makespan:
(6)     makespan = T_AFTime(v_k)
(7)   end if
(8) end for
(9) return makespan

Algorithm 1: Fit(m): calculating the fitness value of a molecule and the processor allocation optimization.

with a new model for processor selection is adopted as the initialization of the global optimization phase, and it is also utilized as a local search process to promote the intensification capability of DRSCRO. There are four kinds of elementary chemical reactions in DRSCRO: on-wall collision, decomposition, intermolecular collision, and synthesis. Each kind of reaction contains two types of operators, which are respectively utilized in the two phases of DRSCRO. In each iteration, one of the elementary chemical reaction operators is performed to generate new molecules, and the PEs of the newly generated molecules (i.e., their fitness function values) are calculated. In addition, SMole is tracked and participates only in on-wall ineffective collisions and intermolecular ineffective collisions in the global optimization phase, to explore as much of the solution space in its neighborhoods as possible; the main purpose is to prevent the super molecule from changing dramatically. The iteration of each phase repeats until the stopping criteria (or next-phase criteria) are met, and SMole and its fitness function value are then the final solution and makespan (i.e., the global min point), respectively. In the implementations of the experiments in this paper, the next-phase criteria and the stop criteria of DRSCRO are set as the condition that there is no makespan improvement after 10000 consecutive iterations in the search loop.
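The two-phase iteration above can be outlined as a single reusable loop. This is only a structural sketch: the 50/50 collision choice and the operator stubs stand in for the decomposition/synthesis branching criteria of Figure 2, and the stopping rule is the no-improvement counter described in the text.

```python
import random

# Structural outline of one DRSCRO phase: repeatedly pick a reaction, generate
# new molecules, evaluate their fitness (PE), and stop after `patience`
# consecutive iterations without a makespan improvement.

def phase(pop, fit, unary_ops, binary_ops, patience):
    best = min(pop, key=fit)
    stale = 0
    while stale < patience:
        if len(pop) >= 2 and random.random() < 0.5:
            m1, m2 = random.sample(pop, 2)          # two or more molecules chosen
            new = list(random.choice(binary_ops)(m1, m2))
        else:
            new = list(random.choice(unary_ops)(random.choice(pop)))
        pop = pop + new                              # fitness (PE) evaluation
        cand = min(pop, key=fit)
        if fit(cand) < fit(best):
            best, stale = cand, 0                    # track the super molecule
        else:
            stale += 1
    return best, pop

# Toy run: molecules are integers and the "makespan" is the distance to 3.
random.seed(2)
fit = lambda x: abs(x - 3)
best, _ = phase([10, 20], fit,
                unary_ops=[lambda m: [m + random.choice([-1, 1])]],
                binary_ops=[lambda a, b: [(a + b) // 2]],
                patience=30)
print(best)
```

In DRSCRO itself, the population holds tuple-encoded molecules, `fit` is Algorithm 1, and the operator lists are the SMS or GO operator families of the respective phase.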

4.2. Molecular Structure and Fitness Function. This subsection presents the encoding of scheduling solutions (i.e., the molecular structure) and the statement of the fitness function in DRSCRO.

4.2.1. Molecular Structure. In this paper, an atom with three elements can be denoted as a tuple (v_i, f_i, p_i), and the molecular structure M, an array of such tuples, can be formulated as in (8) to represent a solution to the DAG scheduling problem. The order of the tuples in M represents the priority of each DAG task v_i with its allocated processor p_i, and V = (v_1, v_2, ..., v_|V|) is a topological sequence of the DAG, with the hypothetical entry task (with no predecessors) v_1 and exit task (with no successors) v_|V| respectively representing the beginning and end of execution. Moreover, if tuple A is directly before tuple B and v_A is a predecessor of v_B in the DAG, the second element of tuple B, f_B, is 1; otherwise it is 0.

m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)).  (8)
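A minimal sketch of this encoding follows, assuming the edge set v1->v2, v1->v4, v2->v3, v4->v3 (inferred for illustration; it is consistent with the f flags of the molecules shown in Section 4.5).

```python
# Minimal sketch of the tuple encoding of (8), under an assumed edge set.
PRED = {2: {1}, 4: {1}, 3: {2, 4}}  # immediate predecessors per task (assumed)

def encode(order, procs):
    """Build a molecule from a topological task order and a processor map;
    f_i is 1 when the directly preceding task is an immediate predecessor."""
    mol = []
    for idx, v in enumerate(order):
        f = 1 if idx > 0 and order[idx - 1] in PRED.get(v, set()) else 0
        mol.append((v, f, procs[v]))
    return mol

# Reproduces the first molecule of Section 4.5:
print(encode([1, 2, 4, 3], {1: 'p1', 2: 'p1', 3: 'p1', 4: 'p1'}))
# -> [(1, 0, 'p1'), (2, 1, 'p1'), (4, 0, 'p1'), (3, 1, 'p1')]
```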

4.2.2. Fitness Function. Potential energy (PE) is defined as the fitness function value of the corresponding solution represented by S. The overall schedule length of the entire DAG, namely the makespan, is the largest finish time among all tasks, which is equivalent to the actual finish time of the exit node of the DAG. In this paper, the goal of the DAG scheduling problem solved by DRSCRO is to obtain a schedule that minimizes the makespan while ensuring that the precedence constraints of the tasks are not violated. Hence, the fitness function value is defined as

Fit(m) = PE_m = makespan.  (9)

Algorithm 1 presents how to calculate the value of the optimization fitness function Fit(m).

4.3. Super Molecule Selection Phase

4.3.1. Initialization. There are two kinds of initial molecule generators, one used in the phase of super molecule selection and the other used in the phase of global optimization, to generate the initial solutions for DRSCRO to manipulate. The tuples of the first molecule m used in the initialization of the super molecule selection phase are ordered by decreasing upward rank value [27] of their v_i (so that the entry task, which has the largest upward rank, comes first), and the third element, p_i, of each tuple is generated by a random perturbation. The upward rank value can be calculated by

Rank(v_i) = W(v_i) + max_{v_j ∈ successor(v_i)} {comm(v_i, v_j) + Rank(v_j)}.  (10)

A detailed description of the initial molecule generator of the super molecule selection phase is given in Algorithm 2. For the first input molecule m, p_x in each tuple in m is set as p_1.
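The upward rank of (10) can be sketched recursively. Below, W holds the average costs W(v_i) of (7) for the Table 3 example, and SUCC maps each task to its successors with assumed communication weights (hypothetical values, for illustration only).

```python
from functools import lru_cache

# Sketch of the upward rank of (10); SUCC edge weights are assumed values.
W = {1: 8, 2: 14, 3: 15, 4: 10}
SUCC = {1: {2: 17, 4: 14}, 2: {3: 16}, 3: {}, 4: {3: 5}}

@lru_cache(maxsize=None)
def rank(v):
    # W(v) plus the heaviest (comm + rank) path over the successors of v.
    return W[v] + max((c + rank(s) for s, c in SUCC[v].items()), default=0)

# Decreasing upward rank yields the task priority of the first molecule; it
# reproduces the path (v1, v2, v4, v3) of the illustrative example.
order = sorted(W, key=rank, reverse=True)
print(order)  # -> [1, 2, 4, 3]
```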

4.3.2. Elementary Chemical Reaction Operators. In DRSCRO, the operators for super molecule selection just randomly change the p_i of each tuple in a molecule, as intensification searches or diversification searches [25], to optimize the processor mapping of a solution. Figures 3, 4, 5, and 6 respectively show examples of the four operators for super


(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3)   for each p_i in molecule m to randomly change
(4)     change p_i randomly
(5)   end for
(6)   generate a new molecule m′
(7)   MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m): generating the initial population for the super molecule selection phase.

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). The white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m′ from a given reaction molecule m for optimization. OnWallSMS works as follows: (1) the operator randomly chooses a tuple (v_i, f_i, p_i) in m; (2) the operator changes p_i randomly. In the end, the operator

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

generates a new molecule m′ from m as an intensification search.

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m′_1 and m′_2 from a given reaction molecule m. DecompSMS works as follows: (1) the operator generates two molecules, m′_1 = m and m′_2 = m; (2) the operator keeps the tuples in m′_1 which are at the odd positions in m and then changes the remaining p_x's of the tuples in m′_1 randomly; (3) the operator retains the tuples in m′_2 which are at the even positions in m and then changes the remaining p_x's of the tuples in m′_2 randomly. In the end, the operator generates two new molecules, m′_1 and m′_2, from m as a diversification search.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)   for each tempS in tempSet except SMole
(5)     randomly choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0
(6)     generate a random number rnd ∈ (0, 1)
(7)     if rnd ≥ 0.5
(8)       find the first predecessor v_j = Pred(v_i) from v_i to the beginning of molecule tempS
(9)       interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10)      update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11)    end if
(12)    for each p_i in molecule tempS to randomly change
(13)      change p_i randomly
(14)    end for
(15)    if Fit(tempS) < Fit(SMole)
(16)      SMole = tempS
(17)    end if
(18)  end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)   pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m′_1 and m′_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallSMS to generate m′_1 from m_1, and then the operator generates the other new molecule, m′_2, from m_2 in a similar fashion. In the end, the operator generates two new molecules, m′_1 and m′_2, from m_1 and m_2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m′ from given molecules m_1 and m_2. SynthSMS works as follows: the operator keeps the tuples in m′ which are at the same positions in m_1 and m_2 with the same p_x's, and then changes the remaining p_y's in m′ randomly. As a result, the operator generates m′ from m_1 and m_2 as a diversification search.
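The four operators can be sketched on the tuple encoding of (8). These are minimal stand-ins written for illustration (processor names assumed as p1-p3), not the authors' implementation.

```python
import random

# Minimal sketches of the four super-molecule-selection operators: each one
# randomizes only the processor element p_i of some tuples; the task order and
# the f flags are untouched.
PROCS = ['p1', 'p2', 'p3']

def on_wall(m):
    """OnWallSMS: perturb the processor of one randomly chosen tuple."""
    i = random.randrange(len(m))
    v, f, _ = m[i]
    out = list(m)
    out[i] = (v, f, random.choice(PROCS))
    return out

def decomp(m):
    """DecompSMS: two variants; one keeps the 1-based odd positions of m,
    the other the even positions, randomizing the rest."""
    m1 = [t if i % 2 == 0 else (t[0], t[1], random.choice(PROCS))
          for i, t in enumerate(m)]
    m2 = [t if i % 2 == 1 else (t[0], t[1], random.choice(PROCS))
          for i, t in enumerate(m)]
    return m1, m2

def intermole(ma, mb):
    """IntermoleSMS: one on-wall step on each of the two molecules."""
    return on_wall(ma), on_wall(mb)

def synth(ma, mb):
    """SynthSMS: keep positions where both parents agree on the processor,
    randomize the rest."""
    return [(v, f, p) if p == q else (v, f, random.choice(PROCS))
            for (v, f, p), (_, _, q) in zip(ma, mb)]

random.seed(0)
m = [(1, 0, 'p1'), (2, 1, 'p2'), (4, 0, 'p3'), (3, 1, 'p1')]
print(on_wall(m))
```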

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4 respectively present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population with the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of the population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of elements in pop_subset, which is set as PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in the VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of optimizing both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3, in each iteration. d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which demonstrate their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing the load among the various processors usually helps to minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing the communication overhead and idle waiting time of processors always results in a more effective schedule, especially given relatively high unit communication costs. However, there is a contradiction between these two intuitions, because reducing the communication overhead and idle waiting time of processors always means that some processors hold most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4)   d = 1
(5)   while d < d_max do
(6)     randomly generate a molecule m_1 from the d-th neighborhood of m
(7)     apply some local search method with m_1 as the initial molecule (the local optimum is presented by m_2)
(8)     if m_2 is better than m
(9)       m = m_2
(10)      d = 1
(11)    else
(12)      d = d + 1
(13)    end if
(14)  end while
(15) until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.
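Structurally, Algorithm 4 is a standard VNS loop; a sketch follows, with d_max = 2 as in the text. The fitness, neighborhood, and local search below are toy stand-ins (minimizing |x - 7| over the integers) only to make the skeleton runnable; in DRSCRO, the neighborhoods are the load balance and communication reduction structures and fit() is the makespan of Algorithm 1.

```python
import random

# Sketch of the VNS skeleton of Algorithm 4 (toy stand-ins for the operators).
def vns(m, fit, neighbor, local_search, d_max=2, max_iter=100):
    for _ in range(max_iter):               # outer termination condition
        d = 1
        while d < d_max:                    # neighborhood loop, as in Algorithm 4
            m1 = neighbor(m, d)             # random molecule in the d-th neighborhood
            m2 = local_search(m1, fit)      # descend to a local optimum
            if fit(m2) < fit(m):
                m, d = m2, 1                # improvement: recenter, restart at d = 1
            else:
                d += 1
    return m

random.seed(0)
fit = lambda x: abs(x - 7)
neighbor = lambda x, d: x + random.choice([-d, d])
local_search = lambda x, f: min((x - 1, x, x + 1), key=f)
result = vns(20, fit, neighbor, local_search)
print(result)
```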

(1) for each processor p_i in the solution ω do
(2)   compute Tend(p_i)
(3) end for
(4) choose the processor p_max with the largest Tend(p_max)
(5) randomly choose a task v_random from exec(p_max)
(6) randomly choose a processor p_random different from p_max
(7) reallocate v_random to the processor p_random
(8) encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) indicate the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of reducing or increasing tasks): the greater TCload(p_i) is, the stronger the tendency to reduce the tasks on p_i; the greater TCcomm(p_i) is, the stronger the tendency to increase the tasks on p_i. Therefore, a parameter Tend(p_i) is developed to measure the tendency by combining TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model takes both intuitions into comprehensive consideration and can make the VNS algorithm more effective than the original one:

Tend(p_i) = 1 / (1 + e^{−TCload(p_i)/TCcomm(p_i)}).  (11)
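A sketch of the selection model of (11) follows; the TCload/TCcomm values below are made-up illustrative numbers, not taken from the paper.

```python
import math

# Sketch of the processor-selection model of (11).
def tend(tc_load, tc_comm):
    """Logistic combination of the load and communication tendencies."""
    return 1.0 / (1.0 + math.exp(-(tc_load / tc_comm)))

loads = {'p1': (50.0, 10.0), 'p2': (20.0, 25.0), 'p3': (5.0, 40.0)}
t = {p: tend(l, c) for p, (l, c) in loads.items()}
p_max = max(t, key=t.get)   # target of Algorithm 5 (shed a task)
p_min = min(t, key=t.get)   # target of Algorithm 6 (pull a predecessor in)
print(p_max, p_min)         # -> p1 p3
```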

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum number of iterations without improvement to 3. The VNS algorithm stops if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary the p_i of each tuple but also interchange the positions of the tuples in a molecule, as intensification searches or diversification searches [25], to optimize the whole solution.

The on-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) operators are as presented in [30], and we do not repeat them here in order to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper; it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)   compute Tend(p_i)
(3) end for
(4) choose the processor p_min with the smallest Tend(p_min)
(5) set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)   compute the set predecessor(v_i)
(8)   update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)   cand = cand + predecessor(v_i)
(10) end for
(11) randomly choose a task v_random from cand
(12) reallocate v_random to the processor p_min
(13) encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).

(1) randomly choose a tuple (v_i, f_i, p_i) in m_1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m_1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m′_1 = GenLBNeighborhood(m_1)
(5) randomly choose a tuple (v_j, f_j, p_j) in m_2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m_2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m′_2 = GenLBNeighborhood(m_2)

Algorithm 7: IntermoleGO(m_1, m_2).

Table 3: Execution cost of DAG tasks on each processor.

Task   p1   p2   p3
v1      7    8    9
v2     12   14   16
v3     14   15   16
v4     13    3   14

before, to promote the intensification capability of DRSCRO and avoid the function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m′_1 and m′_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallGO to generate m′_1 from m_1, and then the operator generates the other new molecule, m′_2, from m_2 in a similar fashion. In the end, the operator generates two new molecules, m′_1 and m′_2, from m_1 and m_2 as an intensification search. The detailed execution is presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v_1, v_2, v_4, v_3) is found based on the upward rank value of each task in the DAG, and the first molecule, m = ((v_1, 0, p_1), (v_2, 1, p_1), (v_4, 0, p_1), (v_3, 1, p_1)), can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations of the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v_1, 0, p_3), (v_2, 1, p_1), (v_4, 0, p_2), (v_3, 1, p_1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations of the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v_1, 0, p_2), (v_4, 1, p_2), (v_2, 0, p_2), (v_3, 1, p_2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability to solve this kind of NP-hard optimization problem. By analyzing the framework, the molecular structure, the chemical reaction operators, and the operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is approached by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerates convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better intensification-search capability than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also serves as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is also utilized in the ineffective reaction operator. Moreover, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures for promoting the efficiency of VNS, different from the VNS proposed in [22]. All three advantages mentioned above enhance the ability to achieve a better convergence rate and better search results in the whole solution space, which is demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and a comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], by theoretical analysis and experimental results, TMSCRO and DMSCRO were proved to have better performance than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and, for DRSCRO as a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and on the comparison between DRSCRO and other algorithms of a similar kind.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and the convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: a molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; row operations are applied systematically to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest number of tasks at the same level is 6. These two application graphs are used as a test bed not only to enhance the comparability of the various algorithms but also to show the practical application of our proposed algorithm as an illustrative demonstration, without loss of generality. The second extensive test bed for the comparative study consists of DAGs of random graphs. A random graph generator, as presented in [39], is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as the CCR, the amount of computation of a task, the number of successors of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and communication links have the same computation cost and communication cost, respectively.

As shown in Figure 2, the next-phase criterion and the stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.
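A stagnation criterion of this kind can be implemented as a counter that resets on every improvement of the best makespan. The following is a hypothetical sketch (the `step` callable stands in for one iteration of any of the three algorithms; it is not part of the paper's code):

```python
def run_until_stagnant(step, initial_makespan, limit=5000):
    """Run an optimization loop until the best makespan has stayed
    unchanged for `limit` consecutive iterations.

    `step` is any callable returning the makespan found in one iteration.
    """
    best = initial_makespan
    unchanged = 0
    while unchanged < limit:
        current = step()
        if current < best:
            best = current
            unchanged = 0          # an improvement resets the counter
        else:
            unchanged += 1
    return best
```

Setting `limit=5000` gives the DRSCRO criterion above, and `limit=10000` the TMSCRO/DMSCRO criterion.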

Mathematical Problems in Engineering 13

Figure 8: A molecular dynamics code DAG (tasks 0-40 between Entry and Exit nodes; graph not reproduced).

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the speeds of a computing processor being different for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 - hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. Unless otherwise specified in this paper, hl is set to the value that makes the heterogeneity level equal to 2.
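Solving (1 + hl)/(1 - hl) = 2 for hl gives hl = 1/3, which matches the value 0.333 listed in Table 4. A one-line sketch of this rearrangement (the helper name is illustrative):

```python
def hl_for_ratio(ratio):
    """Solve (1 + hl) / (1 - hl) = ratio for the heterogeneity parameter hl."""
    return (ratio - 1.0) / (ratio + 1.0)

hl = hl_for_ratio(2.0)                      # heterogeneity level of 2
assert abs((1 + hl) / (1 - hl) - 2.0) < 1e-12
print(round(hl, 3))  # → 0.333
```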

The details of the parameter settings are shown in Table 4. Parameters 6-12, which concern the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as it is a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7 (tasks 0-26 between Entry and Exit nodes; graph not reproduced).

Table 4: Parameter values for the simulation experiments.

Parameter                                      | Value
CCR                                            | 0.1, 0.2, 1, 2, 5
Number of processors                           | 4, 8, 16, 32
hl                                             | 0.333
Successor number of a task in a random graph   | 1, 2, 3, 4
Total number of tasks in a random graph        | 10, 20, 50
InitialKE                                      | 1000
θ                                              | 500
ϑ                                              | 10
Buffer                                         | 200
KELossRate                                     | 0.2
MoleColl                                       | 0.2
PopSize                                        | 10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0 (HEFT, DMSCRO, TMSCRO, and DRSCRO; |P| = 4, 8, 16, 32).

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2 (HEFT, DMSCRO, TMSCRO, and DRSCRO; |P| = 4, 8, 16, 32).

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.
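The per-configuration statistics reported in the tables (average, best, worst, variance) can be produced from the raw run results as sketched below. The paper does not state which variance definition it reports; population variance is assumed here.

```python
def summarize_runs(makespans):
    """Average/best/worst/variance over independent runs of one configuration."""
    n = len(makespans)
    mean = sum(makespans) / n
    var = sum((m - mean) ** 2 for m in makespans) / n   # population variance (assumed)
    return {"average": mean, "best": min(makespans),
            "worst": max(makespans), "variance": var}

stats = summarize_runs([118.0, 116.0, 117.0, 115.0])    # illustrative run values
assert stats["best"] == 115.0 and stats["worst"] == 118.0
```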

5.3.1. Real-World Application Graphs. Figures 10-13 show the simulation results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5-8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the number of processors increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance in searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; therefore, this indicates that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification search capability, obtained by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16 (CCR = 0.1, 0.2, 1, 2, 5).

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8 (CCR = 0.1, 0.2, 1, 2, 5).

and utilizing one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the average results obtained by DRSCRO when the stopping criterion is satisfied are better than those obtained by TMSCRO and DMSCRO. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of its heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the average makespan increases with the CCR value. This is because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT performs less consistently over a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (avg. makespan) | DMSCRO (avg. makespan) | TMSCRO (avg. makespan) | DRSCRO (avg. makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8  | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (avg. makespan) | DMSCRO (avg. makespan) | TMSCRO (avg. makespan) | DRSCRO (avg. makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8  | 96.952  | 94.965  | 94.455  | 94.266  | 92.333  | 96.551  | 2.008
16 | 82.356  | 80.668  | 80.235  | 80.074  | 78.433  | 82.015  | 1.706
32 | 75.630  | 74.080  | 73.682  | 73.535  | 72.027  | 75.317  | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (avg. makespan) | DMSCRO (avg. makespan) | TMSCRO (avg. makespan) | DRSCRO (avg. makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1   | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2   | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5   | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (avg. makespan) | DMSCRO (avg. makespan) | TMSCRO (avg. makespan) | DRSCRO (avg. makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 96.612  | 94.632  | 94.124  | 93.935  | 92.010  | 96.212  | 2.001
0.2 | 96.952  | 94.965  | 94.455  | 94.266  | 92.333  | 96.551  | 2.008
1   | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2   | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5   | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range, and that metaheuristic algorithms perform more effectively for communication-intensive DAGs.
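The CCR used throughout these experiments is commonly defined as the ratio of the average communication cost over all edges to the average computation cost over all tasks; a DAG with a high CCR is communication-intensive. A minimal sketch under that definition (the function name is illustrative):

```python
def ccr(comp, comm):
    """Communication-to-computation ratio of a DAG.

    comp: {task: computation cost}, comm: {(u, v): communication cost}.
    """
    avg_comp = sum(comp.values()) / len(comp)
    avg_comm = sum(comm.values()) / len(comm)
    return avg_comm / avg_comp

# A DAG with CCR = 5 is communication-intensive: an edge costs 5x a task.
print(ccr({0: 10.0, 1: 10.0, 2: 10.0}, {(0, 1): 50.0, (0, 2): 50.0}))  # → 5.0
```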

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14-16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments. The details of these experimental results are listed in Tables 9-11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has

better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases. It can be seen that the average makespan increases rapidly with the value of CCR. This is because the DAG becomes more communication-intensive as CCR increases, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (avg. makespan) | DMSCRO (avg. makespan) | TMSCRO (avg. makespan) | DRSCRO (avg. makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8  | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the processor number is 32 and CCR = 1.0.

Number of tasks | HEFT (avg. makespan) | DMSCRO (avg. makespan) | TMSCRO (avg. makespan) | DRSCRO (avg. makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
10 | 714.484  | 706.149  | 699.100  | 690.708  | 688.982  | 693.638  | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results (average makespan) of DRSCRO for random graphs under different CCRs; the task number is 50.

Value of CCR | 4 processors | 8 processors | 16 processors | 32 processors
0.1 | 145.093 | 116.068 | 116.058 | 114.592
0.2 | 145.261 | 116.200 | 116.163 | 114.700
1   | 153.248 | 120.228 | 119.511 | 118.534
2   | 194.401 | 166.014 | 164.246 | 162.292
5   | 508.759 | 498.427 | 492.049 | 487.830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2 (HEFT, DMSCRO, TMSCRO, and DRSCRO; |P| = 4, 8, 16, 32).

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of all three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.
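Recording a convergence trace under such a time-budget stopping criterion can be sketched as below. This is a hypothetical harness, not the paper's code: `step` stands in for one iteration of any of the three algorithms, and the trace stores (elapsed time, best-so-far makespan) pairs like those plotted in Figures 17-21.

```python
import time

def trace_until(step, budget_s):
    """Record (elapsed seconds, best-so-far makespan) pairs until the
    total running time reaches budget_s."""
    trace, best = [], float("inf")
    start = time.perf_counter()
    while True:
        elapsed = time.perf_counter() - start
        if elapsed >= budget_s:        # time-based stopping criterion
            break
        best = min(best, step())       # best makespan never increases
        trace.append((elapsed, best))
    return trace
```

With `budget_s=180.0`, this mirrors the 180 s criterion used in the convergence experiments.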

Figure 15: Average makespan for random graphs under different task numbers; the processor number is 32 and CCR = 1.0 (HEFT, DMSCRO, TMSCRO, and DRSCRO; 10, 20, 50 tasks).

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19-21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17-21, the convergence traces of these three algorithms differ markedly, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for random graphs under different CCRs; the task number is 50 (|P| = 4, 8, 16, 32; CCR = 0.1, 0.2, 1, 2, 5).

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16 (makespan versus time for DMSCRO, TMSCRO, and DRSCRO).

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8 (makespan versus time for DMSCRO, TMSCRO, and DRSCRO).

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also allows it to obtain a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks (makespan versus time for DMSCRO, TMSCRO, and DRSCRO).

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks (makespan versus time for DMSCRO, TMSCRO, and DRSCRO).

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks (makespan versus time for DMSCRO, TMSCRO, and DRSCRO).

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).
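One common way to read such percentages is as the relative reduction in time-to-convergence; the exact metric is not spelled out in the text, so the following is only an illustrative sketch under that assumption:

```python
def convergence_speedup(t_ref, t_new):
    """Percentage reduction in time-to-convergence of t_new relative to t_ref."""
    return 100.0 * (t_ref - t_new) / t_ref

# e.g., reaching the stopping criterion in 80.6 s instead of 100 s
print(round(convergence_speedup(100.0, 80.6), 1))  # → 19.4
```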

Moreover, statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method        | Algorithms              | p value  | Hypothesis
Friedman test | DRSCRO, DMSCRO          | 2.53E-02 | Rejected
Friedman test | DRSCRO, TMSCRO          | 2.53E-02 | Rejected
Friedman test | DRSCRO, DMSCRO, TMSCRO  | 6.70E-03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method     | Algorithms              | p value  | Hypothesis
Quade test | DRSCRO, DMSCRO          | 1.32E-02 | Rejected
Quade test | DRSCRO, TMSCRO          | 1.32E-02 | Rejected
Quade test | DRSCRO, DMSCRO, TMSCRO  | 1.09E-03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis of the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, is necessary in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level of α = 0.05 is used in all statistical tests.
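To make the Friedman test concrete, the sketch below computes its chi-square statistic by hand: each problem instance (row) ranks the k algorithms, and the statistic measures how far the rank sums deviate from what equal performance would give. The data values are invented for illustration, not taken from the paper's experiments; a tie-free ranking is assumed for simplicity.

```python
def friedman_statistic(results):
    """Friedman chi-square statistic.

    results[i][j] = measurement of algorithm j on problem instance i
    (lower is better). Assumes no ties within a row.
    """
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])   # best gets rank 1
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Four instances, columns = (DRSCRO, TMSCRO, DMSCRO); first column always best.
data = [[1.0, 1.2, 1.3], [2.0, 2.4, 2.5], [0.9, 1.1, 1.2], [3.0, 3.3, 3.4]]
chi2 = friedman_statistic(data)
print(chi2)  # → 8.0  (> 5.991, the chi-square critical value for df = 2, alpha = 0.05)
```

With k = 3 algorithms the statistic has k - 1 = 2 degrees of freedom, so a value above 5.991 rejects the null hypothesis of equivalent performance at α = 0.05, as in Table 12.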

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively, both of which reject the null hypothesis of equivalent performance. In both tests, our proposed DRSCRO is not only compared against all the algorithms together but also compared pairwise against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, which is the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms on convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions when averaged over all possible fitness functions. However, as the experimental results of the convergence tests show, the proposed DRSCRO can achieve better performance and find good solutions faster than the other metaheuristic algorithms of the same kind. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO obtains a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, takes advantage of the

VNS algorithm in the global optimization phase to improve its optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for better intensification capability. The new processor selection model utilized in the neighborhood structures also promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

In this paper, an algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO can achieve a higher speedup than the other CRO-based algorithms as far as we know, and the DRSCRO algorithm can also obtain a better average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to improve its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will be extended to target two main objectives: (1) minimization of schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406-471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506-521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31-September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230-237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141-147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57-69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330-343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138-153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175-187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23-32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390-1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142-149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395-406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462-487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825-837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113-120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13-22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17-19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1-7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824-834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567-581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272-277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45-72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455-463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381-399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967-980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306-1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1-5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379-396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458-463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097-1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867-2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67-82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175-193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319-360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1-8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683-691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044-2064, 2010.


Page 3: Research Article DRSCRO: A Metaheuristic Algorithm for ...downloads.hindawi.com/journals/mpe/2015/396582.pdf · Research Article DRSCRO: A Metaheuristic Algorithm for Task Scheduling

Mathematical Problems in Engineering 3

Table 1 Parameters used in CRO

Parameters DefinitionPE Current potential energy of a moleculeKE Current kinetic energy of a moleculeInitialKE Initial kinetic energy of a molecule120579 Threshold value guides the choice of on-wall collision or decomposition120599 Threshold value guides the choice of intermolecule collision or synthesisBuffer Initial energy in the central energy bufferKELossRate Loss rate of kinetic energyMoleColl Threshold value to determine whether to perform a unimolecule reaction or an intermolecule reactionPopSize Size of the moleculesiters Number of iterations

algorithms are the solutions to DAG scheduling problemThe PE value of a molecule is calculated by fitness functionwhich is equal to the objective value makespan of thecorresponding solution And KE is for helping the moleculeescape from local optimums and its value is nonnegativeA buffer is also used in CRO-based algorithms for energyinterchange and conservation Moreover to find the solutionwith the global minimal makespan four types of elementarychemical reactions on-wall ineffective collision decompo-sition intermolecular ineffective collision and synthesisare applied for the intensification and the diversificationsearches The typical execution flow of CRO frameworkadopted in DMSCRO and TMSCRO is as proposed in [25]and the parameters used in CRO are presented in Table 1

As metaheuristic methods DMSCRO and TMSCROhave better performance in terms of schedule quality thanheuristic methods and the reason is as presented in the lastparagraph of Section 21 The experimental results in [27 30]prove that both of DMSCRO and TMSCRO outperform GADMSCRO is the first algorithm by applying CRO proposedby Lam and Li in [25] to solve the DAG scheduling problemand it enjoys the advantages of both GA and SA On theone hand the intermolecular collision and on-wall collisiondesigned in DMSCRO have similar effect to the crossoveroperation and the mutation operation in GA respectivelyOn the other hand the energy conservation requirementin DMSCRO is able to guide the searching of the optimalsolution similarly to the way the Metropolis Algorithm of SAguides the evolution of the solutions in SA Two additionaloperations decomposition and synthesis give DMSCROmore opportunities to jump out of the local optimum andexplore the wider areas in the solution space This benefitenables DMSCRO to find good solutions faster than GAwhich has been widely used to evolve solutions for many taskscheduling problems DMSCRO are not compared with SAin [27 30] because the underlying principles and philoso-phies between DMSCRO and SA differ a lot [27] Typicallymetaheuristic algorithms like CRO-based algorithm of GA-based algorithms operating on a population of solutions areable to find good solutions faster than that operating on asingle solution like SA-based algorithms Comparing withDMSCRO TMSCRO applies constrained earliest finish time

algorithm to data pretreatment to take the advantage of thesuper molecule and constrained critical paths [35] whichis as heuristic information for accelerating convergenceMoreover the molecule structure and elementary reactionoperators design in TMSCRO are more reasonable thanthose in DMSCRO on intensification and diversification ofsearching the solution space

However, for solving the NP-hard problem of DAG scheduling on heterogeneous systems, the CRO-based algorithms TMSCRO and DMSCRO, as metaheuristic scheduling strategies, still have very large time expenditure; therefore, their searching capabilities and convergence rates need to be improved. There are three deficiencies of TMSCRO and DMSCRO. First, in [30], the concept of the super molecule is applied for accelerating convergence, and the super molecule is selected by heuristic scheduling strategies, but the performance of this kind of super molecule selection method is affected by the range of problems. Second, in both TMSCRO and DMSCRO, the initial molecules, which are very important for the whole searching process, are randomly created, and the uncertainty of this kind of initialization undermines the searching capabilities of TMSCRO and DMSCRO. Moreover, the intensification capabilities of CRO-based algorithms for DAG scheduling also need to be improved to obtain better average results when the iteration stopping criteria are satisfied.

Therefore, this paper proposes an algorithm, Double-Reaction-Structured CRO (DRSCRO), for DAG task scheduling on heterogeneous systems, aiming at obtaining schedules with better quality. In this paper, the conventional CRO framework scheme is modified, and two reaction phases, one for super molecule selection and another for global optimization, are developed in DRSCRO. CRO, as a metaheuristic algorithm, is utilized in the molecule selection phase to obtain a super molecule [31] for a better convergence rate. Moreover, in the global optimization phase, the variable neighborhood search (VNS) algorithm [21, 32, 36], which is an effective metaheuristic that systematically changes the neighborhood by means of neighborhood structures and a local search, is used to optimize the initial molecule, and one of its neighborhood structures is also adopted in the reaction operator to promote the

4 Mathematical Problems in Engineering

intensification capability. And there is a new model proposed for processor selection, utilized in the neighborhood structures of the VNS algorithm, for better effectiveness.

Moreover, in [21], VNS was incorporated with GA for DAG scheduling, but the task priority was unchangeable in the VNS algorithm in [21], which reduces the efficiency of VNS in obtaining a better solution. So, different from [21], to promote the intensification capability of the whole algorithm, the VNS in DRSCRO is modified under the consideration of optimizing both the scheduling order and the processor assignment.

3. Problem Formulation

The DAG scheduling problem typically has two inputs: a heterogeneous system for task computing in parallel and a parallel program of an application (i.e., a DAG). In this paper, the heterogeneous system is assumed to be a static computing system model presented by P = {p_i | i = 1, 2, 3, ..., |P|}, which is a fully connected network of processors. The heterogeneity level in this paper is formulated as (1 + hl)/(1 − hl), where the parameter hl ∈ (0, 1). In this paper, EcCost_pj(v_i) represents the computation cost of a task v_i mapped to the processor p_j, and the value of each EcCost_pj(v_i) is randomly chosen within the scope of [1 − hl, 1 + hl].

In general, DAG = (V, E) consists of a task (node) set V and an edge set E. EcCost_pj(v_i) is as defined in the first paragraph of this section, and the same processor executes a task in the DAG without preemption. The constraint between tasks v_i and v_j is denoted as the edge e_ij (e_ij ∈ E), which means that the execution of task v_j starts only after the execution result of task v_i has been transmitted to task v_j. Each edge e_ij has a nonnegative weight comm(v_i, v_j) denoting the communication cost between v_i and v_j. Each task in a DAG can only be executed on one processor, and the communication can be performed simultaneously by the processors. In addition, when two communicating tasks are mapped to the same processor, their communication cost is zero. predecessor(v_i) represents the set of the predecessors of v_i, while successor(v_i) represents the set of the successors of v_i. The task with no predecessor is denoted as v_entry, while the task with no successor is denoted as v_exit.

Consider that there is a DAG with |V| tasks to be mapped to a heterogeneous system with |P| processors. Assuming the highest-priority ready task v_i on the processor p_j, the earliest start time of v_i, T_ESTime(v_i, p_j), can be formulated as

T_ESTime(v_i, p_j) = max{T_avail(p_j), T_ready(v_i, p_j)},  (1)

where T_avail(p_j) can be defined as in (2); T_avail(p_j) is the time when processor p_j becomes available for the execution of the task v_i:

T_avail(p_j) = max_{v_k ∈ exec(p_j)} {T_AFTime(v_k)},  (2)

where exec(p_j) represents all the tasks which have already been scheduled on the processor p_j, and T_AFTime(v_k) denotes the actual finish time when the task v_k finishes its execution. T_ready(v_i, p_j) in (1) represents the time when all the data needed for the processing of v_i have been transmitted to p_j, which is formulated as

T_ready(v_i, p_j) = max_{v_k ∈ predecessor(v_i)} {T_AFTime(v_k) + comm(v_k, v_i)},  (3)

where T_AFTime(v_k) has the same definition as in (2), and predecessor(v_i) denotes the set of all the immediate predecessors of task v_i. comm(v_k, v_i) is 0 if the tasks v_k and v_i are mapped to the same processor p_j.

If task v_i is mapped to the processor p_j with the nonpreemptive processing approach, the earliest finish time of task v_i, T_EFTime(v_i, p_j), is formulated as

T_EFTime(v_i, p_j) = T_ESTime(v_i, p_j) + EcCost_pj(v_i).  (4)

After the task v_i is executed by the processor p_j, T_EFTime(v_i, p_j) is assigned to T_AFTime(v_i). The makespan of the entire parallel program is equivalent to the actual finish time of the exit task v_exit:

makespan = max_{v_i ∈ V} {T_AFTime(v_i)} = T_AFTime(v_exit).  (5)

The computation of the communication-to-computation ratio (CCR) can be formulated as in

CCR = (∑_{e_ij ∈ E} comm(v_i, v_j)) / (∑_{v_i ∈ V} W(v_i)),  (6)

where W(v_i) is the average computation cost of task v_i, and it can be calculated as follows:

W(v_i) = (∑_{k=1}^{|P|} EcCost_pk(v_i)) / |P|.  (7)

A simple four-task DAG and a heterogeneous computation system with three processors are shown in Figures 1(a) and 1(b), respectively. The definitions of the notations can be found in Table 2.
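Read operationally, (1)-(5) amount to a single pass over the tasks in priority order. The following Python sketch (an illustration under assumed dict-based inputs and names, not the authors' implementation) evaluates a given schedule accordingly:

```python
# Sketch of the schedule evaluation in (1)-(5). Assumed inputs: exec_cost[v][p]
# is EcCost_p(v), comm[(u, v)] is comm(v_u, v_v), preds[v] is predecessor(v),
# `order` is the task priority order, and proc_of[v] is the assigned processor.
def evaluate_schedule(order, proc_of, exec_cost, comm, preds):
    """Return (makespan, AFT) for tasks scheduled nonpreemptively in order."""
    avail = {}  # T_avail(p): finish time of the last task placed on p, as in (2)
    aft = {}    # T_AFTime(v): actual finish time of each scheduled task
    for v in order:
        p = proc_of[v]
        # (3): data-ready time; communication cost is 0 on the same processor
        ready = max((aft[u] + (0 if proc_of[u] == p else comm[(u, v)])
                     for u in preds[v]), default=0)
        est = max(avail.get(p, 0), ready)   # (1)
        aft[v] = est + exec_cost[v][p]      # (4); AFT takes the value of EFT
        avail[p] = aft[v]                   # incremental update of (2)
    return max(aft.values()), aft           # (5): makespan = AFT of the exit task
```

The later phases of the algorithm only need such an evaluator as the fitness oracle; the inputs above are illustrative, not taken from Figure 1.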

4. Design of DRSCRO

DRSCRO imitates the molecular interactions in chemical reactions, based on the concepts of atoms, molecules, molecular structure, and the energy of a molecule. In DRSCRO, a molecule corresponds to a scheduling solution in DAG scheduling, with a unique molecular structure representing the atom positions in a molecule. We utilize the molecular structure of TMSCRO in our work under the consideration of its capability to represent the constrained relationships between the tasks in a molecule (solution). In addition, the energy of each molecule corresponds to the fitness value of a solution. The molecular interactions try to reconstruct more stable molecular structures with lower energy. There are four kinds of basic chemical reactions, on-wall ineffective


Table 2: Definitions of notations.

DAG = (V, E): input directed acyclic graph with |V| nodes representing tasks and |E| edges representing constrained relations among the tasks.
P = {p_i | i = 1, 2, 3, ..., |P|}: set of heterogeneous processors in the target system.
EcCost_pj(v_i): execution cost of task v_i using processor p_j.
comm(v_k, v_i): communication cost from task v_k to task v_i.
T_ESTime(v_i, p_j): the earliest start time of task v_i mapped to processor p_j.
T_EFTime(v_i, p_j): the earliest finish time of task v_i mapped to processor p_j.
T_avail(p_j): the time when processor p_j is available.
T_ready(v_i, p_j): the time when all the data needed for the processing of v_i have been transmitted to p_j.
T_AFTime(v_k): actual finish time when task v_k finishes its execution.
predecessor(v_i): set of the predecessors of task v_i.
successor(v_i): set of the successors of task v_i.
exec(p_j): set of the tasks which have already been scheduled on processor p_j.
W(v_i): average computation cost of task v_i.
CCR: communication-to-computation ratio.
hl: parameter for adjusting the heterogeneity level in a heterogeneous system.

Figure 1: (a) DAG model. (b) Heterogeneous computation system model (processors p1, p2, p3).

collision, decomposition, intermolecular ineffective collision, and synthesis, for the molecular interactions in DRSCRO, and each kind of reaction contains two subclasses. These two subclasses of reaction operators are applied in the phase of super molecule selection and the phase of global optimization, respectively.

4.1. Framework of DRSCRO. The framework of DRSCRO to schedule a DAG job is shown in Figure 2, with two basic phases: the phase of super molecule selection and the phase of global optimization. In each phase, DRSCRO first initializes the process of the phase, and then the phase process enters iteration.

In this framework, DRSCRO first executes the phase of super molecule selection to obtain the super molecule (i.e., just the molecule with the global minimal makespan), SMole, with the other output molecules as the input of the next global optimization phase for the first time (in the other times, the input of the VNS algorithm is the population with SMole after each iteration in the global optimization phase), and then DRSCRO performs the phase of global optimization to approach the global optimum solution. The VNS algorithm


Figure 2: Framework of DRSCRO (the super molecule selection phase applies the operators OnWallSMS, DecompSMS, IntermoleSMS, and SynthSMS; the global optimization phase applies OnWallGO, DecompGO, IntermoleGO, and SynthGO).


(1) makespan = 0
(2) for each node v_k in m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)) do
(3)   calculate the actual finish time of v_k (i.e., T_AFTime(v_k))
(4)   if makespan < T_AFTime(v_k)
(5)     update makespan
(6)     makespan = T_AFTime(v_k)
(7)   end if
(8) end for
(9) return makespan

Algorithm 1: Fit(m): calculating the fitness value of a molecule and the processor allocation optimization.

with a new model for processor selection is adopted as the initialization of the global optimization phase, and it is also utilized as a local search process to promote the intensification capability of DRSCRO. There are four kinds of elementary chemical reactions in DRSCRO: on-wall collision, decomposition, intermolecular collision, and synthesis. And each kind of reaction contains two types of operators, which are respectively utilized in the two phases of DRSCRO. In each iteration, one of the elementary chemical reaction operators is performed to generate new molecules, and the PEs of the newly generated molecules (i.e., their fitness function values) will be calculated. In addition, SMole will be tracked and only participates in on-wall ineffective collision and intermolecular ineffective collision in the global optimization phase, to explore as much of the solution space in its neighborhoods as possible; the main purpose is to prevent the super molecule from changing dramatically. The iteration of each phase repeats until the stopping criteria (or next-phase criteria) are met, and SMole and its fitness function value are just the final solution and makespan (i.e., the global min point), respectively. In the implementations of the experiments in this paper, the next-phase criteria and the stopping criteria of DRSCRO are set as when there is no makespan improvement after 10000 consecutive iterations in the search loop.

4.2. Molecular Structure and Fitness Function. This subsection presents the encoding of scheduling solutions (i.e., the molecular structure) and the statement of the fitness function in DRSCRO.

4.2.1. Molecular Structure. In this paper, an atom with three elements can be denoted as a tuple (v_i, f_i, p_i), and the molecular structure m, an array of tuples, can be formulated as in (8) to represent a solution to the DAG scheduling problem. The order of the tuples in m represents the priority of each DAG task v_i with the allocated processor p_i, and V = (v_1, v_2, ..., v_|V|) is a topological sequence of the DAG, with the hypothetical entry task (with no predecessors) v_1 and exit task (with no successors) v_|V| respectively representing the beginning and end of execution. Moreover, if tuple A is immediately before tuple B and v_A is the predecessor of v_B in the DAG, the second element of tuple B, f_B, will be 1, and vice versa:

m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)).  (8)

4.2.2. Fitness Function. Potential energy (PE) is defined as the fitness function value of the corresponding solution represented by S. The overall schedule length of the entire DAG, namely makespan, is the largest finish time among all tasks, which is equivalent to the actual finish time of the exit node in the DAG. In this paper, the goal of the DAG scheduling problem solved by DRSCRO is to obtain a schedule that minimizes makespan while ensuring that the precedence constraints of the tasks are not violated. Hence, each fitness function value is defined as

Fit(m) = PE_m = makespan.  (9)

Algorithm 1 presents how to calculate the value of the optimization fitness function Fit(m).

4.3. Super Molecule Selection Phase

4.3.1. Initialization. There are two kinds of initial molecule generator, one used in the phase of super molecule selection and the other used in the phase of global optimization, to generate the initial solutions for DRSCRO to manipulate. The tuples of the first molecule m used in the initialization of the super molecule selection phase are ascendingly ordered by the upward rank value [27] of their v_i, and the third element p_i of each tuple is generated by a random perturbation. The upward rank value can be calculated by

Rank(v_i) = W(v_i) + max_{v_j ∈ successor(v_i)} {comm(v_i, v_j) + Rank(v_j)}.  (10)

A detailed description of the initial molecule generator of the super molecule selection phase is given in Algorithm 2. For the first input molecule m, p_x in each tuple in m is set as p_1.
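Equation (10) is a bottom-up recursion over the successor sets, so it memoizes naturally. A sketch (the names `avg_cost`, `comm`, and `succs` are assumptions, not the paper's code):

```python
# Upward rank of (10): Rank(v) = W(v) + max over successors of comm + Rank.
# avg_cost[v] plays the role of W(v); exit tasks get Rank(v) = W(v).
from functools import lru_cache

def make_rank(avg_cost, comm, succs):
    @lru_cache(maxsize=None)
    def rank(v):
        return avg_cost[v] + max(
            (comm[(v, w)] + rank(w) for w in succs[v]), default=0)
    return rank
```

Memoization matters because tasks shared by several paths would otherwise be revisited exponentially often.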

4.3.2. Elementary Chemical Reaction Operators. In DRSCRO, the operators for super molecule selection just randomly change p_i of each tuple in a molecule, as the intensification searches or the diversification searches [25], to optimize the processor mapping of a solution. Figures 3, 4, 5, and 6 respectively show examples of the four operators for super


(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3)   for each p_i in molecule m to randomly change
(4)     change p_i randomly
(5)   end for
(6)   generate a new molecule m'
(7)   MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m): generating the initial population for the super molecule selection phase.
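Algorithm 2 can be sketched directly: every further molecule keeps the seed's tuple order and redraws all processor assignments at random. The function signature and the seeded generator are illustrative assumptions:

```python
# InitMoleSMS sketch: generate the initial population for the super molecule
# selection phase by randomizing the processor element of every tuple.
import random

def init_mole_sms(m, procs, pop_size, rng=None):
    rng = rng or random.Random(0)
    population = [m]  # the first molecule keeps its given assignment
    while len(population) < pop_size:
        population.append(tuple((v, f, rng.choice(procs)) for v, f, _ in m))
    return population
```

Note that only the p element varies; the task order and f flags of the seed molecule are preserved in every generated molecule.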

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). And the white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m' from a given reaction molecule m for optimization. OnWallSMS works as follows. (1) The operator randomly chooses a tuple (v_i, f_i, p_i) in m. (2) The operator changes p_i randomly. In the end, the operator

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

generates a new molecule m' from m as an intensification search.

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m'_1 and m'_2 from a given reaction molecule m. DecompSMS works as follows. (1) The operator generates two molecules m'_1 = m and m'_2 = m. (2) The operator keeps the tuples in m'_1 which are at the odd positions in m, and then changes the remaining p_x's of the tuples in m'_1 randomly. (3) The operator retains the tuples in m'_2 which are at the even positions in m, and then changes the remaining p_x's of the tuples in m'_2 randomly. In the end, the operator generates two new molecules m'_1 and m'_2 from m as a diversification search.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)   for each tempS in tempSet except SMole
(5)     choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0 randomly
(6)     generate a random number rnd ∈ (0, 1)
(7)     if rnd ≥ 0.5
(8)       find the first predecessor v_j = Pred(v_i) from v_i to the beginning in molecule tempS
(9)       interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10)      update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11)    end if
(12)    for each p_i in molecule tempS to randomly change
(13)      change p_i randomly
(14)    end for
(15)    if Fit(tempS) < Fit(SMole)
(16)      SMole = tempS
(17)    end if
(18)  end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)   pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m'_1 and m'_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallSMS to generate m'_1 from m_1, and then the operator generates the other new molecule m'_2 from m_2 in a similar fashion. In the end, the operator generates two new molecules m'_1 and m'_2 from m_1 and m_2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m' from given molecules m_1 and m_2. SynthSMS works as follows. The operator keeps the tuples in m' which are at the same positions in m_1 and m_2 with the same p_x's, and then changes the remaining p_y's in m' randomly. As a result, the operator generates m' from m_1 and m_2 as a diversification search.
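Since all four super-molecule-selection operators only touch the processor element of the tuples, three of them can be sketched compactly (the intermolecular operator just applies the on-wall step to each parent). This is an illustrative reading of the operator descriptions, with the tuple encoding assumed as above and m' taken to follow m_1's tuple order in the synthesis case:

```python
# OnWallSMS redraws one processor; DecompSMS redraws the processors at the
# even/odd positions of two copies; SynthSMS keeps positions where both
# parents agree on the processor and redraws the rest.
import random

def on_wall_sms(m, procs, rng):
    i = rng.randrange(len(m))                                # (1) pick a tuple
    v, f, _ = m[i]
    return m[:i] + ((v, f, rng.choice(procs)),) + m[i + 1:]  # (2) change its p

def decomp_sms(m, procs, rng):
    m1 = tuple(t if i % 2 == 0 else (t[0], t[1], rng.choice(procs))
               for i, t in enumerate(m))   # keep 1-indexed odd positions
    m2 = tuple(t if i % 2 == 1 else (t[0], t[1], rng.choice(procs))
               for i, t in enumerate(m))   # keep 1-indexed even positions
    return m1, m2

def synth_sms(m1, m2, procs, rng):
    return tuple(a if a[2] == b[2] else (a[0], a[1], rng.choice(procs))
                 for a, b in zip(m1, m2))
```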

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4 respectively present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population with the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of the population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of elements in pop_subset, which is set as PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in the VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of optimizing both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3. In each iteration, d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which have demonstrated their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], there are two intuitions used to construct the neighborhood structures. One is that balancing load among the various processors usually helps minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing the communication overhead and idle waiting time of processors always results in a more effective schedule, especially given a relatively high unit communication cost. However, there is a contradiction between these two intuitions, because reducing the communication overhead and idle waiting time of processors always means that some processors hold most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4)   d = 1
(5)   while d < d_max do
(6)     Randomly generate a molecule m_1 from the d-th neighborhood of m
(7)     Apply some local search method with m_1 as the initial molecule (the local optimum presented by m_2)
(8)     if m_2 is better than m
(9)       m = m_2
(10)      d = 1
(11)    else
(12)      d = d + 1
(13)    end if
(14)  end while
(15)  until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.
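Stripped of the scheduling specifics, the loop of Algorithm 4 is the standard VNS skeleton. The sketch below assumes `neighborhoods` is a list of shaking functions, `local_search` returns a local optimum, and `fit` is the makespan-based fitness (lower is better); it is a generic illustration, not the paper's implementation:

```python
# Generic VNS skeleton as in Algorithm 4: shake in neighborhood d, run a local
# search, restart from the first neighborhood on improvement, else widen d.
def vns(m, neighborhoods, local_search, fit, rounds=20):
    for _ in range(rounds):            # stands in for the termination condition
        d = 0
        while d < len(neighborhoods):
            m1 = neighborhoods[d](m)   # random molecule from the d-th neighborhood
            m2 = local_search(m1)      # local optimum reached from m1
            if fit(m2) < fit(m):
                m, d = m2, 0           # improvement: start over at d = 1
            else:
                d += 1
    return m
```

The restart-on-improvement rule is what distinguishes VNS from a plain multi-neighborhood local search: a success always sends the search back to the cheapest neighborhood.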

(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_max with the largest Tend(p_max)
(5) Randomly choose a task v_random from exec(p_max)
(6) Randomly choose a processor p_random different from p_max
(7) Reallocate v_random to the processor p_random
(8) Encode and reschedule the changed solution ω'
(9) return ω'

Algorithm 5: GenNeighborhoodofLoadBalance(ω).

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) are the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of task reducing or increasing). The greater TCload(p_i) is, the stronger the tendency of reducing tasks on p_i is, and the greater TCcomm(p_i) is, the stronger the tendency of increasing tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure the tendency with the combination of TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model is under the comprehensive consideration of both intuitions and can make the VNS algorithm more effective than the original one:

Tend(p_i) = 1 / (1 + e^{−(TCload(p_i)/TCcomm(p_i))}).  (11)
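The selection model (11) is a logistic function of the load-to-communication ratio; combined with Algorithm 5 it yields a one-move load-balance neighbor. The sketch below assumes TCload and TCcomm are already computed per processor and that the task-to-processor map is a plain dict; both are illustration assumptions:

```python
# Tend(p) of (11) and the load-balance move of Algorithm 5: pick the processor
# with the largest Tend, then relocate one of its tasks to another processor.
import math, random

def tend(tc_load, tc_comm):
    return 1.0 / (1.0 + math.exp(-(tc_load / tc_comm)))

def load_balance_move(assign, tc_load, tc_comm, rng):
    procs = list(tc_load)
    p_max = max(procs, key=lambda p: tend(tc_load[p], tc_comm[p]))
    v = rng.choice([t for t, p in assign.items() if p == p_max])
    moved = dict(assign)
    moved[v] = rng.choice([p for p in procs if p != p_max])
    return moved
```

Because the logistic function is monotone in TCload/TCcomm, a heavily loaded processor with little communication overhead gets the highest Tend and is the one relieved of a task.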

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum iteration number without improvement to 3. The VNS algorithm will stop if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population. The current population and the VNS output are first merged and sorted by increasing makespan; then the first PopSize molecules are selected to generate the new initial population.

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary p_i of each tuple but also interchange the positions of the tuples in a molecule, as the intensification searches or the diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here, to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper, and it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_min with the smallest Tend(p_min)
(5) Set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)   Compute the set predecessor(v_i)
(8)   Update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)   cand = cand + predecessor(v_i)
(10) end for
(11) Randomly choose a task v_random from cand
(12) Reallocate v_random to the processor p_min
(13) Encode and reschedule the changed solution ω'
(14) return ω'

Algorithm 6: GenNeighborhoodofCommReduction(ω).

(1) choose randomly a tuple (v_i, f_i, p_i) in m_1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m_1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m'_1 = GenLBNeighborhood(m_1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m_2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m_2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m'_2 = GenLBNeighborhood(m_2)

Algorithm 7: IntermoleGO(m_1, m_2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks  p_1  p_2  p_3
v_1     7    8    9
v_2    12   14   16
v_3    14   15   16
v_4    13    3   14

before, to promote the intensification capability of DRSCRO and avoid function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m'_1 and m'_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallGO to generate m'_1 from m_1, and then the operator generates the other new molecule m'_2 from m_2 in a similar fashion. In the end, the operator generates two new molecules m'_1 and m'_2 from m_1 and m_2 as an intensification search. The detailed executions are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v_1, v_2, v_4, v_3) is found based on the upward rank value of each task in the DAG, and the first molecule m = ((v_1, 0, p_1), (v_2, 1, p_1), (v_4, 0, p_1), (v_3, 1, p_1)) can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations in the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v_1, 0, p_3), (v_2, 1, p_1), (v_4, 0, p_2), (v_3, 1, p_1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations in the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v_1, 0, p_2), (v_4, 1, p_2), (v_2, 0, p_2), (v_3, 1, p_2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability of solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and the operational environment in DRSCRO, it can be shown to some extent that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is approached by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT as applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better capability of intensification search than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also serves as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is also utilized in the ineffective reaction operator. Moreover, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures to promote the efficiency of VNS, different from the VNS proposed in [22]. All three advantages mentioned previously enhance the ability to obtain better rapidity of convergence and better search results in the whole solution space, which is demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and the comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], TMSCRO and DMSCRO were proved, by theoretical analysis and experimental results, to have better performance than GA. Our work is therefore a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and, since DRSCRO is a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and on the comparison between DRSCRO and other algorithms of the same kind.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and the convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first test bed consists of two real-world problem DAGs: the molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; row operations are applied systematically to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest number of tasks at the same level is 6. These two application graphs are used as a test bed not only to enhance the comparability of the various algorithms but also to demonstrate the application of the proposed algorithm without loss of generality. The second test bed for the comparative study consists of randomly generated DAGs. The random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as the CCR, the amount of computation of a task, the number of successors of a task, and the total number of tasks in a random graph. It is also assumed that all tasks have the same computation cost and all communication links have the same communication cost, respectively.
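A generator of this kind can be sketched as follows. This is a minimal illustration of the idea, not the actual generator of [39]; the forward-edge layering scheme and all names (`random_dag`, `max_successors`) are assumptions:

```python
import random

def random_dag(num_tasks, max_successors, ccr, seed=0):
    """Generate a random DAG as adjacency lists plus cost annotations.

    Each task draws edges only to higher-numbered tasks, which guarantees
    acyclicity.  All tasks get the same computation cost and all edges the
    same communication cost, chosen so that the graph's CCR (mean
    communication cost / mean computation cost) equals the requested value.
    """
    rng = random.Random(seed)
    comp_cost = 10.0                      # identical for every task
    comm_cost = ccr * comp_cost           # identical for every edge
    edges = {v: [] for v in range(num_tasks)}
    for v in range(num_tasks - 1):
        n_succ = rng.randint(1, max_successors)
        candidates = list(range(v + 1, num_tasks))
        for w in rng.sample(candidates, min(n_succ, len(candidates))):
            edges[v].append(w)
    return edges, comp_cost, comm_cost

edges, comp, comm = random_dag(num_tasks=10, max_successors=3, ccr=0.2)
# Every edge points "forward", so the graph is acyclic by construction.
assert all(w > v for v, ws in edges.items() for w in ws)
assert comm / comp == 0.2
```

Because every edge goes from a lower-numbered task to a higher-numbered one, the node numbering itself is a topological order, so no cycle can arise.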

As shown in Figure 2, the next-phase criterion and the stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.
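Such a stagnation-based criterion can be sketched as follows (a generic illustration; `search_step` is a hypothetical callback standing in for one iteration of any of the three algorithms):

```python
def run_until_stagnant(search_step, stagnation_limit=5000):
    """Run an iterative search until the best makespan has stayed
    unchanged for `stagnation_limit` consecutive iterations, then
    return the best makespan found."""
    best = float("inf")
    unchanged = 0
    while unchanged < stagnation_limit:
        makespan = search_step()          # one iteration of the metaheuristic
        if makespan < best:
            best = makespan
            unchanged = 0                 # progress: reset the counter
        else:
            unchanged += 1
    return best

# Toy search: improves on the first 10 iterations, then stalls forever.
values = iter([100 - i for i in range(10)])
step = lambda: next(values, 91)
assert run_until_stagnant(step, stagnation_limit=50) == 91
```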

Mathematical Problems in Engineering 13

Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speeds of a computing processor differ for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 − hl) is equal to the largest possible ratio of the best processor speed to the worst processor speed for each task. Unless otherwise specified in this paper, hl is set to the value that makes the heterogeneity level equal to 2.
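One plausible reading of this cost model is sketched below; the function name and the uniform-sampling choice are assumptions, but the bound follows directly from the sampling interval [1 − hl, 1 + hl]:

```python
import random

def sample_costs(num_tasks, num_procs, hl, seed=0):
    """Draw EcCost(v_i, p_j) uniformly from [1 - hl, 1 + hl] for every
    task/processor pair, following the heterogeneity model described
    in the text."""
    rng = random.Random(seed)
    return [[rng.uniform(1 - hl, 1 + hl) for _ in range(num_procs)]
            for _ in range(num_tasks)]

hl = 1 / 3                        # makes (1 + hl) / (1 - hl) = 2
het_level = (1 + hl) / (1 - hl)
assert abs(het_level - 2.0) < 1e-12
costs = sample_costs(num_tasks=50, num_procs=8, hl=hl)
# For every task, the worst/best cost ratio across processors cannot
# exceed the heterogeneity level (1 + hl) / (1 - hl).
for row in costs:
    assert max(row) / min(row) <= het_level
```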

The details of the parameter setting are shown in Table 4. Parameters 6–12, which belong to the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with that of two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value over a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as it is a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiment.

Parameter | Value
CCR | 0.1, 0.2, 1, 2, 5
Number of processors | 4, 8, 16, 32
hl | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph | 10, 20, 50
InitialKE | 1000
θ | 500
ϑ | 10
Buffer | 200
KELossRate | 0.2
MoleColl | 0.2
PopSize | 10

bed, the makespan is averaged over 30 different random-graph running instances. Moreover, to prove the robustness

14 Mathematical Problems in Engineering

Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.
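These per-run statistics can be reproduced as sketched below (population variance is assumed; the input makespans are made-up numbers):

```python
def run_statistics(makespans):
    """Summarize a set of independent-run makespans the way the result
    tables do: average, best (minimum), worst (maximum), and population
    variance."""
    n = len(makespans)
    avg = sum(makespans) / n
    var = sum((m - avg) ** 2 for m in makespans) / n
    return {"average": avg, "best": min(makespans),
            "worst": max(makespans), "variance": var}

stats = run_statistics([118.0, 120.0, 122.0])
assert stats["average"] == 120.0
assert stats["best"] == 118.0 and stats["worst"] == 122.0
assert abs(stats["variance"] - 8.0 / 3.0) < 1e-12
```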

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5–8 list the details of the experimental results.

As shown in Figures 10 and 11, the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance in searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature, so this result indicates that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification-search capability, by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and the utilization of one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the average results obtained by DRSCRO are better than those obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of the heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the makespan on average increases with the CCR value, because the heterogeneous processors are in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance over a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8 | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
16 | 82.356 | 80.668 | 80.235 | 80.074 | 78.433 | 82.015 | 1.706
32 | 75.630 | 74.080 | 73.682 | 73.535 | 72.027 | 75.317 | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2 | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5 | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 96.612 | 94.632 | 94.124 | 93.935 | 92.010 | 96.212 | 2.001
0.2 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
1 | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2 | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5 | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range, and the metaheuristic algorithms perform more effectively for communication-intensive DAGs.
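The CCR driving this effect is simply the ratio of the average communication cost per edge to the average computation cost per task; a minimal sketch:

```python
def ccr(comp_costs, comm_costs):
    """Communication-to-computation ratio of a DAG: average communication
    cost per edge divided by average computation cost per task.  A DAG
    with CCR > 1 is communication-intensive; CCR < 1 means it is
    computation-intensive."""
    avg_comm = sum(comm_costs) / len(comm_costs)
    avg_comp = sum(comp_costs) / len(comp_costs)
    return avg_comm / avg_comp

# Four tasks costing 10 each and three edges costing 50 each give CCR = 5,
# the most communication-intensive setting used in the experiments.
assert ccr([10, 10, 10, 10], [50, 50, 50]) == 5.0
```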

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of these experimental results are listed in Tables 9–11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has

better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases: the makespan on average rises rapidly with increasing CCR, because the DAG becomes more communication-intensive, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8 | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the processor number is 32 and CCR = 1.0.

Number of tasks | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
10 | 714.484 | 706.149 | 699.100 | 690.708 | 688.982 | 693.638 | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for the random graphs under different CCRs; the task number is 50.

Value of CCR | Processor number is 4 | Processor number is 8 | Processor number is 16 | Processor number is 32
0.1 | 145.093 | 116.068 | 116.058 | 114.592
0.2 | 145.261 | 116.200 | 116.163 | 114.700
1 | 153.248 | 120.228 | 119.511 | 118.534
2 | 194.401 | 166.014 | 164.246 | 162.292
5 | 508.759 | 498.427 | 492.049 | 487.830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of these three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random-graph running instances.

Figure 15: Average makespan for random graphs under different task numbers; the processor number is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graphs under different CCRs; the task number is 50.

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16.

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also allows it to obtain a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation experimental results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).
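A "converges faster by X%" figure of this kind can be read as a relative reduction in time-to-convergence; a sketch under that assumption (the timings below are made up):

```python
def speedup_percent(t_reference, t_faster):
    """Relative reduction in time-to-convergence: how much sooner the
    faster algorithm reaches its final makespan, as a percentage of the
    reference algorithm's convergence time."""
    return 100.0 * (t_reference - t_faster) / t_reference

# If a reference algorithm needs 100 s to converge and another needs
# 80.6 s, the latter converges faster by 19.4%.
assert abs(speedup_percent(100.0, 80.6) - 19.4) < 1e-9
```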

Moreover, the statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Friedman test | DRSCRO, DMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO, TMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO, DMSCRO, TMSCRO | 6.70E-03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Quade test | DRSCRO, DMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO, TMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO, DMSCRO, TMSCRO | 1.09E-03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither a normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
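The Friedman statistic itself is straightforward to compute: rank the algorithms on each problem instance and measure how far the mean ranks deviate from the expected rank (k + 1)/2. A minimal sketch for the tie-free case (the input scores are made up; a library routine would normally also supply the p value):

```python
def friedman_statistic(results):
    """Friedman chi-square statistic for a matrix results[i][j] holding
    the score of algorithm j on dataset i (lower = better, no ties).
    Under the null hypothesis of equivalent algorithms it is
    approximately chi-square distributed with k - 1 degrees of freedom."""
    n, k = len(results), len(results[0])
    # Rank algorithms within each dataset: rank 1 for the best score.
    mean_ranks = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            mean_ranks[j] += rank / n
    return 12.0 * n / (k * (k + 1)) * sum(
        (r - (k + 1) / 2.0) ** 2 for r in mean_ranks)

# Three algorithms on four instances; algorithm 0 always wins and
# algorithm 2 always loses, so the statistic hits its maximum n*(k-1) = 8.
scores = [[1.0, 2.0, 3.0]] * 4
assert friedman_statistic(scores) == 8.0
```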

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively, which both reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is not only compared against all the algorithms together but is also compared against the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions when averaged over all possible fitness functions. However, as the experimental results of the convergence tests show, the proposed DRSCRO can achieve better performance and find good solutions faster than the other similar metaheuristic algorithms. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, DRSCRO takes advantage of the

VNS algorithm in the global optimization phase to improve the optimization capability. A load-balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability. The new processor selection model utilized in the neighborhood structures also promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. The super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate, which distinguishes DRSCRO from other CRO-based algorithms for DAG scheduling on heterogeneous systems. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load-balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO can achieve a higher speedup than the other CRO-based algorithms, as far as we know, and DRSCRO can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will be extended to target two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.


intensification capability, and there is a new model proposed for processor selection, utilized in the neighborhood structures of the VNS algorithm, for better effectiveness.

Moreover, in [21], VNS was incorporated with GA for DAG scheduling, but the task priority was unchangeable in the VNS algorithm in [21], which reduces the efficiency of VNS in obtaining a better solution. Therefore, different from [21], to promote the intensification capability of the whole algorithm, the VNS in DRSCRO is modified under the consideration of optimizing both the scheduling order and the processor assignment.

3. Problem Formulation

The DAG scheduling problem typically has two inputs: a heterogeneous system for computing tasks in parallel and a parallel application program (i.e., a DAG). In this paper, the heterogeneous system is assumed to be a static computing system model presented by P = {p_i | i = 1, 2, 3, ..., |P|}, which is a fully connected network of processors. The heterogeneity level in this paper is formulated as (1 + hl)/(1 − hl), where the parameter hl ∈ (0, 1). In this paper, EcCost_pj(v_i) represents the computation cost of a task v_i mapped to the processor p_j, and the value of each EcCost_pj(v_i) is randomly chosen within the scope of [1 − hl, 1 + hl].

In general, DAG = (V, E) consists of a task (node) set V and an edge set E. EcCost_pj(v_i) is as defined in the first paragraph of this section, and the same processor executes a task in the DAG without preemption. The constraint between tasks v_i and v_j is denoted as the edge e_ij (e_ij ∈ E), which means that the execution of task v_j can start only after the execution result of task v_i has been transmitted to task v_j. Each edge e_ij has a nonnegative weight comm(v_i, v_j) denoting the communication cost between v_i and v_j. Each task in a DAG can only be executed on one processor, and the communications can be performed simultaneously by the processors. In addition, when two communicating tasks are mapped to the same processor, their communication cost is zero. predecessor(v_i) represents the set of the predecessors of v_i, while successor(v_i) represents the set of the successors of v_i. The task with no predecessor is denoted as v_entry, while the task with no successor is denoted as v_exit.

Consider a DAG with |V| tasks to be mapped to a heterogeneous system with |P| processors. Assuming the highest-priority ready task v_i is on the processor p_j, the earliest start time of v_i, T_ESTime(v_i, p_j), can be formulated as

    T_ESTime(v_i, p_j) = max{T_avail(p_j), T_ready(v_i, p_j)},    (1)

where T_avail(p_j), defined in (2), is the time when processor p_j becomes available for the execution of the task v_i:

    T_avail(p_j) = max_{v_k ∈ exec(p_j)} T_AFTime(v_k),    (2)

where exec(p_j) represents all the tasks that have already been scheduled on the processor p_j, and T_AFTime(v_k) denotes the actual finish time at which the task v_k finishes its execution.

T_ready(v_i, p_j) in (1) represents the time when all the data needed for the processing of v_i have been transmitted to p_j, which is formulated as

    T_ready(v_i, p_j) = max_{v_k ∈ predecessor(v_i)} {T_AFTime(v_k) + comm(v_k, v_i)},    (3)

where T_AFTime(v_k) has the same definition as in (2) and predecessor(v_i) denotes the set of all the immediate predecessors of task v_i. comm(v_k, v_i) is 0 if the tasks v_k and v_i are mapped to the same processor p_j.

If task v_i is mapped to the processor p_j with the nonpreemptive processing approach, the earliest finish time of task v_i, T_EFTime(v_i, p_j), is formulated as

    T_EFTime(v_i, p_j) = T_ESTime(v_i, p_j) + EcCost_pj(v_i).    (4)

After the task v_i is executed by the processor p_j, T_EFTime(v_i, p_j) is assigned to T_AFTime(v_i). The makespan of the entire parallel program is equivalent to the actual finish time of the exit task v_exit:

    makespan = max_{v_i ∈ V} T_AFTime(v_i) = T_AFTime(v_exit).    (5)

The communication-to-computation ratio (CCR) can be formulated as in

    CCR = (Σ_{e_ij ∈ E} comm(v_i, v_j)) / (Σ_{v_i ∈ V} W(v_i)),    (6)

where W(v_i) is the average computation cost of task v_i, and it can be calculated as follows:

    W(v_i) = (Σ_{k=1}^{|P|} EcCost_pk(v_i)) / |P|.    (7)
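The CCR and average-cost definitions in (6) and (7) can be sketched in a few lines of Python (a minimal illustration only; the dictionary-based data layout and the helper names are our own, not from the paper):

```python
# Sketch of eqs. (6)-(7): W(v) averages a task's execution cost over all
# processors, and CCR is total communication cost over total average
# computation cost. Data layout is an assumption for illustration:
#   ec_cost[v][p]       -> EcCost_p(v)
#   edges_comm[(u, v)]  -> comm(u, v) for each edge (u, v) in E

def avg_cost(v, ec_cost):
    """W(v) of eq. (7): mean execution cost of task v over the processors."""
    costs = ec_cost[v]
    return sum(costs.values()) / len(costs)

def ccr(edges_comm, ec_cost):
    """CCR of eq. (6): sum of edge costs over sum of average task costs."""
    total_comm = sum(edges_comm.values())
    total_comp = sum(avg_cost(v, ec_cost) for v in ec_cost)
    return total_comm / total_comp
```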

A simple four-task DAG and a heterogeneous computation system with three processors are shown in Figures 1(a) and 1(b), respectively. The definitions of the notations can be found in Table 2.
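The timing model of (1)-(5) can be exercised with a short Python sketch that walks a schedule's (task, processor) pairs in scheduling order and returns the makespan. This is an illustrative reading of the model, not the paper's implementation; the data layout and all names are assumptions:

```python
# Evaluate a schedule under the model of Section 3 (eqs. (1)-(5)).
#   order   : list of (task, processor) pairs in a precedence-preserving order
#   ec_cost : ec_cost[task][processor] -> EcCost
#   comm    : comm[(u, v)] -> communication cost of edge (u, v)
#   preds   : preds[task] -> set of immediate predecessors

def evaluate(order, ec_cost, comm, preds):
    avail = {}    # T_avail(p): when processor p becomes free (eq. (2))
    aft = {}      # T_AFTime(v): actual finish time of task v
    proc_of = {}  # processor each finished task was mapped to
    for v, p in order:
        # T_ready(v, p): all needed data has arrived at p (eq. (3));
        # communication cost is 0 when predecessor ran on the same processor
        ready = max(
            (aft[u] + (0 if proc_of[u] == p else comm[(u, v)])
             for u in preds[v]),
            default=0,
        )
        est = max(avail.get(p, 0), ready)   # T_ESTime, eq. (1)
        aft[v] = est + ec_cost[v][p]        # T_EFTime -> T_AFTime, eq. (4)
        avail[p] = aft[v]
        proc_of[v] = p
    return max(aft.values())                # makespan, eq. (5)
```

Note how mapping two communicating tasks to the same processor zeroes the edge cost, exactly as stated above.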

4. Design of DRSCRO

DRSCRO imitates the molecular interactions in chemical reactions, based on the concepts of atoms, molecule, molecular structure, and energy of a molecule. In DRSCRO, a molecule corresponds to a scheduling solution in DAG scheduling, with a unique molecular structure representing the atom positions in the molecule. We utilize the molecular structure of TMSCRO in our work in consideration of its capability to represent the constrained relationship between the tasks in a molecule (solution). In addition, the energy of each molecule corresponds to the fitness value of a solution. The molecular interactions try to reconstruct more stable molecular structures with lower energy. There are four kinds of basic chemical reactions: on-wall ineffective


Table 2: Definitions of notations.

DAG = (V, E): input directed acyclic graph, with |V| nodes representing tasks and |E| edges representing the constrained relations among the tasks.
P = {p_i | i = 1, 2, 3, ..., |P|}: set of heterogeneous processors in the target system.
EcCost_pj(v_i): execution cost of task v_i using processor p_j.
comm(v_k, v_i): communication cost from task v_k to task v_i.
T_ESTime(v_i, p_j): the earliest start time of task v_i when mapped to processor p_j.
T_EFTime(v_i, p_j): the earliest finish time of task v_i when mapped to processor p_j.
T_avail(p_j): the time when processor p_j is available.
T_ready(v_i, p_j): the time when all the data needed for the processing of v_i have been transmitted to p_j.
T_AFTime(v_k): actual finish time when task v_k finishes its execution.
predecessor(v_i): set of the predecessors of task v_i.
successor(v_i): set of the successors of task v_i.
exec(p_j): set of the tasks that have already been scheduled on processor p_j.
W(v): average computation cost of task v.
CCR: communication-to-computation ratio.
hl: parameter for adjusting the heterogeneity level in a heterogeneous system.

Figure 1: (a) DAG model. (b) Heterogeneous computation system model.

collision, decomposition, intermolecular ineffective collision, and synthesis, for the molecular interactions in DRSCRO, and each kind of reaction contains two subclasses. These two subclasses of reaction operators are applied in the phase of super molecule selection and the phase of global optimization, respectively.

4.1. Framework of DRSCRO. The framework of DRSCRO to schedule a DAG job is shown in Figure 2, with two basic phases: the phase of super molecule selection and the phase of global optimization. In each phase, DRSCRO first initializes the process of the phase, and then the phase process enters iteration.

In this framework, DRSCRO first executes the phase of super molecule selection to obtain the super molecule (i.e., just the molecule with the global minimal makespan), SMole, with the other output molecules as the input of the global optimization phase for the first time (in the other times, the input of the VNS algorithm is the population with SMole after each iteration in the global optimization phase), and then DRSCRO performs the phase of global optimization to approach the global optimum solution. The VNS algorithm


Figure 2: Framework of DRSCRO (flowchart: initialization and reaction iteration of the super molecule selection phase, with the operators OnWallSMS, DecompSMS, IntermoleSMS, and SynthSMS, followed by initialization (InitMoleGOVNS) and reaction iteration of the global optimization phase, with the operators OnWallGO, DecompGO, IntermoleGO, and SynthGO).


(1) makespan = 0
(2) for each node v_k in m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)) do
(3)   calculate the actual finish time of v_k (i.e., T_AFTime(v_k))
(4)   if makespan < T_AFTime(v_k)
(5)     update makespan
(6)     makespan = T_AFTime(v_k)
(7)   end if
(8) end for
(9) return makespan

Algorithm 1: Fit(m): calculating the fitness value of a molecule and the processor allocation optimization.

with a new model for processor selection is adopted as the initialization of the global optimization phase, and it is also utilized as a local search process to promote the intensification capability of DRSCRO. There are four kinds of elementary chemical reactions in DRSCRO: on-wall collision, decomposition, intermolecular collision, and synthesis. And each kind of reaction contains two types of operators, which are respectively utilized in the two phases of DRSCRO. In each iteration, one of the elementary chemical reaction operators is performed to generate new molecules, and the PEs of the newly generated molecules (i.e., their fitness function values) will be calculated. In addition, SMole will be tracked and only participates in on-wall ineffective collision and intermolecular ineffective collision in the global optimization phase, to explore as much as possible of the solution space in its neighborhoods; the main purpose is to prevent the super molecule from changing dramatically. The iteration of each phase repeats until the stopping criteria (or next phase criteria) are met, and SMole and its fitness function value are just the final solution and makespan (i.e., global min point), respectively. In the implementations of the experiments in this paper, the next phase criteria and the stop criteria of DRSCRO are set as when there is no makespan improvement after 10000 consecutive iterations in the search loop.

4.2. Molecular Structure and Fitness Function. This subsection presents the encoding of scheduling solutions (i.e., the molecular structure) and the statement of the fitness function in DRSCRO.

4.2.1. Molecular Structure. In this paper, an atom with three elements can be denoted as a tuple (v_i, f_i, p_i), and the molecular structure m, an array of tuples, can be formulated as in (8) to represent a solution to the DAG scheduling problem. The order of the tuples in m represents the priority of each DAG task v_i with the allocated processor p_i, and V = (v_1, v_2, ..., v_|V|) is a topological sequence of the DAG, with the hypothetical entry task (with no predecessors) v_1 and exit task (with no successors) v_|V| respectively representing the beginning and end of execution. Moreover, if tuple A is before tuple B and v_A is the predecessor of v_B in the DAG, the second integer of tuple B, f_B, will be 1, and vice versa:

    m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)).    (8)
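Under one plausible reading of this definition (f_i = 1 exactly when the task immediately before v_i in the tuple order is one of its DAG predecessors, which matches the example molecules in Section 4.5), the encoding can be sketched as follows; the helper names are ours, not from the paper:

```python
# Build the tuple array of eq. (8) from a topological task order and a
# processor assignment. The f-flag interpretation is an assumption: f_i = 1
# iff the immediately preceding task in the sequence is a DAG predecessor.

def encode(seq, proc, preds):
    """seq: task order; proc: task -> processor; preds: task -> set of preds."""
    molecule = []
    for k, v in enumerate(seq):
        f = 1 if k > 0 and seq[k - 1] in preds[v] else 0
        molecule.append((v, f, proc[v]))
    return molecule
```

With this reading, swapping two adjacent tuples (as the VNS and IntermoleGO operators do) only requires refreshing the f-flags near the swap point, which is why Algorithms 3 and 7 update f_{i−1}, f_i, and f_{i+1} after an exchange.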

4.2.2. Fitness Function. Potential energy (PE) is defined as the fitness function value of the corresponding solution represented by a molecule m. The overall schedule length of the entire DAG, namely, the makespan, is the largest finish time among all tasks, which is equivalent to the actual finish time of the exit node in the DAG. In this paper, the goal of the DAG scheduling problem solved by DRSCRO is to obtain a scheduling that minimizes the makespan and ensures that the precedence of the tasks is not violated. Hence, each fitness function value is defined as

    Fit(m) = PE_m = makespan.    (9)

Algorithm 1 presents how to calculate the value of the optimization fitness function Fit(m).

4.3. Super Molecule Selection Phase

4.3.1. Initialization. There are two kinds of initial molecule generators, one used in the phase of super molecule selection and the other used in the phase of global optimization, to generate the initial solutions for DRSCRO to manipulate. The tuples of the first molecule m used in the initialization of the super molecule selection phase are ascendingly ordered by the upward rank value [27] of their v_i, and element three, p_i, of each tuple is generated by a random perturbation. The upward rank value can be calculated by

    Rank(v_i) = W(v_i) + max_{v_j ∈ successor(v_i)} {comm(v_i, v_j) + Rank(v_j)}.    (10)

A detailed description of the initial molecule generator of the super molecule selection phase is given in Algorithm 2. For the first input molecule m, p_x in each tuple in m is set as p_1.
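Equation (10) is a straightforward bottom-up recursion over the successors of each task; a memoised Python sketch (data layout and names are our own) is:

```python
# Upward rank of eq. (10): rank(v) = W(v) + max over successors s of
# (comm(v, s) + rank(s)); exit tasks, having no successors, get rank W(v).
#   w[v]          -> average computation cost W(v)
#   comm[(v, s)]  -> communication cost of edge (v, s)
#   succs[v]      -> set of immediate successors of v

def upward_rank(v, w, comm, succs, memo=None):
    if memo is None:
        memo = {}
    if v not in memo:
        tail = max((comm[(v, s)] + upward_rank(s, w, comm, succs, memo)
                    for s in succs[v]), default=0)
        memo[v] = w[v] + tail
    return memo[v]
```

Sorting the tasks by this value yields the priority order used to build the first molecule.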

4.3.2. Elementary Chemical Reaction Operators. In DRSCRO, the operators for super molecule selection just randomly change p_i of each tuple in a molecule, as the intensification searches or the diversification searches [25], to optimize the processor mapping of a solution. Figures 3, 4, 5, and 6 respectively show examples of the four operators for super


(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3)   for each p_i in molecule m to randomly change
(4)     change p_i randomly
(5)   end for
(6)   generate a new molecule m′
(7)   MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m): generating the initial population for the super molecule selection phase.

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). The white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m′ from a given reaction molecule m for optimization. OnWallSMS works as follows: (1) the operator randomly chooses a tuple (v_i, f_i, p_i) in m; (2) the operator changes p_i randomly. In the end, the operator

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

generates a new molecule m′ from m as an intensification search.

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m′_1 and m′_2 from a given reaction molecule m. DecompSMS works as follows: (1) the operator generates two molecules m′_1 = m and m′_2 = m; (2) the operator keeps the tuples in m′_1 which are at the odd positions in m and then changes the remaining p_x's of the tuples in m′_1 randomly; (3) the operator retains the tuples in m′_2 which are at the even positions in m and then changes the remaining p_x's of the tuples in m′_2 randomly. In the end, the operator generates two new molecules m′_1 and m′_2 from m as a diversification search.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)   for each tempS in tempSet except SMole
(5)     choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0, randomly
(6)     generate a random number rnd ∈ (0, 1)
(7)     if rnd ≥ 0.5
(8)       find the first predecessor v_j = Pred(v_i) from v_i to the beginning in molecule tempS
(9)       interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10)      update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11)    end if
(12)    for each p_i in molecule tempS to randomly change
(13)      change p_i randomly
(14)    end for
(15)    if Fit(tempS) < Fit(SMole)
(16)      SMole = tempS
(17)    end if
(18)  end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)   pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m′_1 and m′_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallSMS to generate m′_1 from m_1, and then the operator generates the other new molecule m′_2 from m_2 in a similar fashion. In the end, the operator generates two new molecules m′_1 and m′_2 from m_1 and m_2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m′ from given molecules m_1 and m_2. SynthSMS works as follows: the operator keeps the tuples in m′ which are at the same positions in m_1 and m_2 with the same p_x's, and then changes the remaining p_y's in m′ randomly. As a result, the operator generates m′ from m_1 and m_2 as a diversification search.

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4 respectively present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population with the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of the population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of the elements in pop_subset, which is set as PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of optimizing both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3, in each iteration. d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which demonstrate their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing the load among the various processors usually helps to minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing the communication overhead and idle waiting time of processors always results in a more effective schedule, especially given a relatively high unit communication cost. However, there is a contradiction between these two intuitions, because reducing the communication overhead and idle waiting time of processors always means that some processors are loaded with most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d, d = 1, 2, 3, ..., d_max
(3) for each individual m in pop_subset do
(4)   d = 1
(5)   while d < d_max do
(6)     Randomly generate a molecule m_1 from the dth neighborhood of m
(7)     Apply some local search method with m_1 as the initial molecule (the local optimum is represented by m_2)
(8)     if m_2 is better than m
(9)       m = m_2
(10)      d = 1
(11)    else
(12)      d = d + 1
(13)    end if
(14)  end while
(15)  until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.

(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_max with the largest Tend(p_max)
(5) Randomly choose a task v_random from exec(p_max)
(6) Randomly choose a processor p_random different from p_max
(7) Reallocate v_random to the processor p_random
(8) Encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be all the task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) are the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of task reducing or increasing). The greater TCload(p_i) is, the stronger the tendency of reducing tasks on p_i is; and the greater TCcomm(p_i) is, the stronger the tendency of increasing tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure the tendency with the combination of TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model is under the comprehensive consideration of both intuitions and can make the VNS algorithm more effective than the original one:

    Tend(p_i) = 1 / (1 + e^(−TCload(p_i)/TCcomm(p_i))).    (11)
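Equation (11) is simply a logistic (sigmoid) function of the load-to-communication ratio, so its value always lies in (0, 1) and crosses 0.5 when the load term vanishes. A direct sketch (function and argument names are ours):

```python
import math

# Tend(p_i) of eq. (11): a logistic function of the ratio of the execution
# load TCload(p_i) to the communication overhead TCcomm(p_i). Higher load
# relative to communication pushes the value toward 1 (i.e., a stronger
# tendency to move tasks off this processor in the load-balance structure).

def tend(tc_load, tc_comm):
    return 1.0 / (1.0 + math.exp(-(tc_load / tc_comm)))
```

Algorithm 5 then picks the processor with the largest Tend value, while Algorithm 6 picks the one with the smallest, so the two neighborhood structures pull in opposite directions under a single score.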

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum iteration number without improvement to 3. The VNS algorithm will stop if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population. The current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary p_i of each tuple but also interchange the positions of the tuples in a molecule, as the intensification searches or the diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper; it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_min with the smallest Tend(p_min)
(5) Set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)   Compute the set predecessor(v_i)
(8)   Update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)   cand = cand + predecessor(v_i)
(10) end for
(11) Randomly choose a task v_random from cand
(12) Reallocate v_random to the processor p_min
(13) Encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).

(1) choose randomly a tuple (v_i, f_i, p_i) in m_1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m_1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m′_1 = GenLBNeighborhood(m_1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m_2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m_2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m′_2 = GenLBNeighborhood(m_2)

Algorithm 7: IntermoleGO(m_1, m_2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks    p_1    p_2    p_3
v_1      7      8      9
v_2      12     14     16
v_3      14     15     16
v_4      13     3      14

before, to promote the intensification capability of DRSCRO and to avoid function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m′_1 and m′_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallGO to generate m′_1 from m_1, and then the operator generates the other new molecule m′_2 from m_2 in a similar fashion. In the end, the operator generates two new molecules m′_1 and m′_2 from m_1 and m_2 as an intensification search. The detailed executions are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v_1, v_2, v_4, v_3) is found based on the upward rank value of each task in the DAG, and the first molecule, m = ((v_1, 0, p_1), (v_2, 1, p_1), (v_4, 0, p_1), (v_3, 1, p_1)), can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations in the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v_1, 0, p_3), (v_2, 1, p_1), (v_4, 0, p_2), (v_3, 1, p_1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations in the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v_1, 0, p_2), (v_4, 1, p_2), (v_2, 0, p_2), (v_3, 1, p_2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated their capability for solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown to some extent that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is approached by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the first phase of DRSCRO for super molecule selection searches a wider area of the solution space than CEFT, applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better capability of intensification search than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also serves as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is also utilized in the ineffective reaction operator. Moreover, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures to promote the efficiency of VNS, different from the VNS proposed in [22]. All three advantages mentioned above enhance the ability to achieve better rapidity of convergence and a better search result in the whole solution space, as demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], by theory analysis and experimental results, TMSCRO and DMSCRO proved to have better performance than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and for DRSCRO as a metaheuristic algorithm, we focus on the performance of our proposed algorithm itself and the comparison between DRSCRO and other similar kinds of algorithms.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan test and the convergence rate test, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first test bed consists of two real-world problem DAGs: the molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; it systematically applies row operations to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest task number at the same level is 6. These two application graphs are used as a test bed not only to enhance the comparability of the various algorithms but also to demonstrate the application of the proposed algorithm as an illustrative example without loss of generality. The second test bed for the comparative study consists of randomly generated DAGs. The random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as the CCR, the amount of computation of a task, the number of successors of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and communication links have the same computation cost and communication cost, respectively.
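To make the random-graph setup concrete, a minimal acyclic generator can be sketched as follows. This is our own simplified illustration, not the generator of [39]; it only reproduces the "bounded successor count" characteristic, and all identifiers are ours:

```python
import random

def random_dag(num_tasks, max_successors, seed=0):
    """Minimal layered random-DAG sketch: every edge points from a task to a
    higher-numbered task, which guarantees acyclicity by construction."""
    rng = random.Random(seed)
    edges = []
    for v in range(num_tasks - 1):
        k = rng.randint(1, max_successors)
        # A task can only have as many successors as there are later tasks.
        succs = rng.sample(range(v + 1, num_tasks), min(k, num_tasks - 1 - v))
        edges.extend((v, s) for s in succs)
    return edges

edges = random_dag(10, 3)
# Every edge goes "forward", so the graph contains no cycles.
assert all(u < v for u, v in edges)
```

Fixing the seed makes a generated instance reproducible across runs, which is useful when averaging over independent runs of the metaheuristics.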

As shown in Figure 2, the next-phase criterion and the stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.
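This "unchanged for N consecutive iterations" rule can be sketched generically as below (our own illustration; `step` is a placeholder for one search iteration of any of the algorithms):

```python
def run_until_stable(step, patience=5000, max_iter=10**6):
    """Iterate `step` (which returns the current best makespan) until the
    best value has stayed unchanged for `patience` consecutive iterations:
    patience=5000 corresponds to the DRSCRO criterion, patience=10000 to the
    TMSCRO/DMSCRO criterion. `max_iter` is a safety cap."""
    best = step()
    unchanged = 0
    for _ in range(max_iter):
        cur = step()
        if cur < best:
            best, unchanged = cur, 0  # improvement: reset the counter
        else:
            unchanged += 1
            if unchanged >= patience:
                break
    return best
```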

Mathematical Problems in Engineering 13

Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speeds of a computing processor are different for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set to the value that makes the heterogeneity level 2, unless otherwise specified in this paper.
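The relation between hl and the heterogeneity level can be checked directly. The sampling rule below is an assumption of ours (one common way to realize MHM-style heterogeneity; the paper does not spell out the sampling):

```python
import random

def heterogeneity_level(hl):
    """Biggest possible ratio of best to worst processor speed for a task."""
    return (1 + hl) / (1 - hl)

def processor_speed(base_speed, hl, rng):
    """Assumed realization: sample a task-specific speed uniformly from
    [base*(1-hl), base*(1+hl)]."""
    return base_speed * rng.uniform(1 - hl, 1 + hl)

rng = random.Random(0)
hl = 1 / 3  # makes (1 + hl)/(1 - hl) = 2, matching hl = 0.333 in Table 4
speeds = [processor_speed(100.0, hl, rng) for _ in range(4)]
```

With hl = 1/3, the fastest-to-slowest speed ratio for any task is bounded by (4/3)/(2/3) = 2, i.e., heterogeneity level 2 as used throughout the experiments.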

The details of the parameter setting are shown in Table 4. Parameters 6–12, which concern the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as it is a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiment.

Parameter | Value
CCR | 0.1, 0.2, 1, 2, 5
Number of processors | 4, 8, 16, 32
hl | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph | 10, 20, 50
InitialKE | 1000
θ | 500
ϑ | 10
Buffer | 200
KELossRate | 0.2
MoleColl | 0.2
PopSize | 10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.
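The per-configuration statistics reported in the tables can be computed as below (a sketch of ours; whether population or sample variance is reported is not stated in the paper, so population variance is assumed here):

```python
from statistics import mean, pvariance

def summarize_runs(makespans):
    """Average, best, worst, and variance over independent runs, the four
    quantities reported per configuration in Tables 5-10."""
    return {
        "average": mean(makespans),
        "best": min(makespans),        # lower makespan is better
        "worst": max(makespans),
        "variance": pvariance(makespans),
    }

stats = summarize_runs([116.0, 117.5, 115.2, 118.1])
```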

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5–8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance in searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature, which suggests that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification-search capability, obtained by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and by utilizing one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the performance of the average results obtained by DRSCRO is better than that obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of its heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the average makespan increases with the CCR value. This is because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance over a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (average makespan) | DMSCRO (average) | TMSCRO (average) | DRSCRO (average) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8 | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average) | TMSCRO (average) | DRSCRO (average) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
16 | 82.356 | 80.668 | 80.235 | 80.074 | 78.433 | 82.015 | 1.706
32 | 75.630 | 74.080 | 73.682 | 73.535 | 72.027 | 75.317 | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (average makespan) | DMSCRO (average) | TMSCRO (average) | DRSCRO (average) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2 | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5 | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (average makespan) | DMSCRO (average) | TMSCRO (average) | DRSCRO (average) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 96.612 | 94.632 | 94.124 | 93.935 | 92.010 | 96.212 | 2.001
0.2 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
1 | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2 | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5 | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range, and that metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of these experimental results are listed in Tables 9–11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has

better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases. It can be seen that the average makespan rises rapidly with increasing CCR. This is because the DAG becomes more communication-intensive as CCR increases, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average) | TMSCRO (average) | DRSCRO (average) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8 | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Number of tasks | HEFT (average makespan) | DMSCRO (average) | TMSCRO (average) | DRSCRO (average) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
10 | 714.484 | 706.149 | 699.100 | 690.708 | 688.982 | 693.638 | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for the random graphs under different CCRs; the task number is 50.

Value of CCR | 4 processors | 8 processors | 16 processors | 32 processors
0.1 | 145.093 | 116.068 | 116.058 | 114.592
0.2 | 145.261 | 116.200 | 116.163 | 114.700
1 | 153.248 | 120.228 | 119.511 | 118.534
2 | 194.401 | 166.014 | 164.246 | 162.292
5 | 508.759 | 498.427 | 492.049 | 487.830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of the three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the beginning of the time counting of DRSCRO is set to the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.
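A minimal sketch of this time-budget protocol follows (our own illustration; `step` stands for one search iteration of any of the three algorithms, and the recorded trace of (elapsed time, best makespan) pairs is what the convergence figures plot):

```python
import time

def run_for_budget(step, budget_s=180.0):
    """Keep the incumbent best makespan while a wall-clock budget lasts,
    the stopping rule used in the convergence tests (e.g., 180 s)."""
    start = time.monotonic()
    best, trace = float("inf"), []
    while time.monotonic() - start < budget_s:
        best = min(best, step())               # incumbent never worsens
        trace.append((time.monotonic() - start, best))
    return best, trace
```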

Figure 15: Average makespan for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graphs under different CCRs; the task number is 50.

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16.

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also gives it a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation experimental results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).
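Such figures can be derived from time-to-convergence measurements with a simple relative-reduction formula. The paper does not state the formula explicitly, so the following is our assumption of how the percentages are computed:

```python
def percent_faster(t_other, t_drscro):
    """Relative reduction in time-to-convergence: how much faster DRSCRO
    reaches its final makespan than a competitor (assumed definition)."""
    return 100.0 * (t_other - t_drscro) / t_other

# E.g., if TMSCRO needs 100 s to converge and DRSCRO needs 80.6 s,
# DRSCRO is 19.4% faster by this measure.
assert abs(percent_faster(100.0, 80.6) - 19.4) < 1e-6
```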

Moreover, the statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Friedman test | DRSCRO vs. DMSCRO | 2.53E−02 | Rejected
Friedman test | DRSCRO vs. TMSCRO | 2.53E−02 | Rejected
Friedman test | DRSCRO vs. DMSCRO vs. TMSCRO | 6.70E−03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Quade test | DRSCRO vs. DMSCRO | 1.32E−02 | Rejected
Quade test | DRSCRO vs. TMSCRO | 1.32E−02 | Rejected
Quade test | DRSCRO vs. DMSCRO vs. TMSCRO | 1.09E−03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis of the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, is necessary in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are specifically considered, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
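To make the ranking-based procedure concrete, the Friedman chi-square statistic can be computed directly. The sketch below is our own pure-Python illustration (it ignores ties and omits the p-value lookup, which a statistics package would supply):

```python
def friedman_statistic(results):
    """Friedman chi-square statistic for k algorithms measured on n problem
    instances; `results` holds n rows of k makespans (lower is better)."""
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])  # best gets rank 1
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank  # ties ignored in this sketch
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Three algorithms on four instances; the first algorithm always wins.
stat = friedman_statistic([[1, 2, 3], [1, 3, 2], [1, 2, 3], [1, 3, 2]])
```

A larger statistic indicates stronger disagreement with the null hypothesis of equivalent performance; the rejection decisions in Tables 12 and 13 follow from comparing the resulting p values with α = 0.05.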

Tables 12 and 13, respectively, list the test results of the Friedman test and the Quade test, which both reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is not only compared against all the algorithms together but also compared against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, which is the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of a similar kind, because, when averaged over all possible fitness functions, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions, according to the No-Free-Lunch Theorem. However, as the experimental results of the convergence tests show, the proposed DRSCRO can achieve better performance and find good solutions faster than the other metaheuristic algorithms of a similar kind. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, under the consideration of the optimization of scheduling order and processor assignment, takes advantage of the

VNS algorithm in the global optimization phase to improve the optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability. The new processor selection model utilized in the neighborhood structures also promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. The super molecule selection phase is used to obtain a super molecule by a metaheuristic method for a better convergence rate, which differs from other CRO-based algorithms for DAG scheduling on heterogeneous systems. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and DRSCRO can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will be extended to aim at two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighbourhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.


Table 2: Definitions of notations.

DAG = (V, E): input directed acyclic graph with |V| nodes representing tasks and |E| edges representing the precedence constraints among the tasks.
P = {p_i : i = 1, 2, ..., |P|}: set of heterogeneous processors in the target system.
EcCost_{p_j}(v_i): execution cost of task v_i on processor p_j.
comm(v_k, v_i): communication cost from task v_k to task v_i.
T_ESTime(v_i, p_j): earliest start time of task v_i when mapped to processor p_j.
T_EFTime(v_i, p_j): earliest finish time of task v_i when mapped to processor p_j.
T_avail(p_j): time when processor p_j becomes available.
T_ready(v_i, p_j): time when all the data needed for the processing of v_i have been transmitted to p_j.
T_AFTime(v_k): actual finish time when task v_k finishes its execution.
predecessor(v_i): set of the predecessors of task v_i.
successor(v_i): set of the successors of task v_i.
exec(p_j): set of the tasks already scheduled on processor p_j.
W(v): average computation cost of task v.
CCR: communication-to-computation ratio.
hl: parameter for adjusting the heterogeneity level in a heterogeneous system.
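Using the notations of Table 2, the earliest finish time of a task on a candidate processor can be sketched as follows. This is our own illustration of the standard construction; taking the communication cost as zero between tasks on the same processor is a common assumption in this model:

```python
def earliest_finish_time(task, proc, ec_cost, comm, pred, aft, placed_on, avail):
    """T_EFTime(v_i, p_j) from the Table 2 notations:
    T_ready(v_i, p_j) = max over predecessors v_k of
        T_AFTime(v_k) + comm(v_k, v_i)   (comm = 0 on the same processor),
    T_ESTime = max(T_avail(p_j), T_ready), and
    T_EFTime = T_ESTime + EcCost_{p_j}(v_i)."""
    t_ready = max(
        (aft[v] + (0 if placed_on[v] == proc else comm[(v, task)])
         for v in pred[task]),
        default=0.0,  # entry tasks have no predecessors
    )
    t_es = max(avail[proc], t_ready)
    return t_es + ec_cost[(task, proc)]

# Tiny example (all values assumed): task "b" depends on "a" (done on p0).
pred = {"a": [], "b": ["a"]}
aft = {"a": 5.0}
placed_on = {"a": 0}
comm = {("a", "b"): 3.0}
ec_cost = {("b", 0): 4.0, ("b", 1): 2.0}
avail = {0: 7.0, 1: 0.0}
eft_p0 = earliest_finish_time("b", 0, ec_cost, comm, pred, aft, placed_on, avail)
eft_p1 = earliest_finish_time("b", 1, ec_cost, comm, pred, aft, placed_on, avail)
```

In the example, p1 wins despite the communication delay because it is idle and faster for "b"; this trade-off is exactly what processor selection must weigh.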

Figure 1: (a) DAG model. (b) Heterogeneous computation system model.

There are four kinds of elementary chemical reaction, namely, on-wall ineffective collision, decomposition, intermolecular ineffective collision, and synthesis, for molecular interactions in DRSCRO, and each kind of reaction contains two subclasses. These two subclasses of reaction operators are applied in the phase of super molecule selection and the phase of global optimization, respectively.

4.1. Framework of DRSCRO. The framework of DRSCRO to schedule a DAG job is shown in Figure 2, with two basic phases: the phase of super molecule selection and the phase of global optimization. In each phase, DRSCRO first initializes the process of the phase, and then the phase process enters iteration.

In this framework, DRSCRO first executes the phase of super molecule selection to obtain the super molecule (i.e., the molecule with the global minimal makespan), SMole, with the other output molecules as the input of the next global optimization phase for the first time (in the other times, the input of the VNS algorithm is the population with SMole after each iteration in the global optimization phase), and then DRSCRO performs the phase of global optimization to approach the global optimum solution. The VNS algorithm


Figure 2: Framework of DRSCRO.


(1) makespan = 0
(2) for each node v_k in m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)) do
(3)   calculate the actual finish time of v_k (i.e., T_AFTime(v_k))
(4)   if makespan < T_AFTime(v_k)
(5)     update makespan:
(6)     makespan = T_AFTime(v_k)
(7)   end if
(8) end for
(9) return makespan

Algorithm 1: Fit(m): calculating the fitness value of a molecule and the processor allocation optimization.

with a new model for processor selection is adopted as the initialization of the global optimization phase, and it is also utilized as a local search process to promote the intensification capability of DRSCRO. There are four kinds of elementary chemical reaction in DRSCRO: on-wall collision, decomposition, intermolecular collision, and synthesis. Each kind of reaction contains two types of operators, which are respectively utilized in the two phases of DRSCRO. In each iteration, one of the elementary chemical reaction operators is performed to generate new molecules, and the PEs of the newly generated molecules (i.e., their fitness function values) are calculated. In addition, SMole is tracked and only participates in on-wall ineffective collision and intermolecular ineffective collision in the global optimization phase, to explore the solution space in its neighborhoods as much as possible; the main purpose is to prevent the super molecule from changing dramatically. The iteration of each phase repeats until the stopping criteria (or next-phase criteria) are met, and SMole and its fitness function value are the final solution and makespan (i.e., the global min point), respectively. In the implementations of the experiments in this paper, the next-phase criteria and the stop criteria of DRSCRO are set as when there is no makespan improvement after 10000 consecutive iterations in the search loop.

4.2. Molecular Structure and Fitness Function. This subsection presents the encoding of scheduling solutions (i.e., the molecular structure) and the statement of the fitness function in DRSCRO.

4.2.1. Molecular Structure. In this paper, an atom with three elements can be denoted as a tuple (v_i, f_i, p_i), and the molecular structure m, an array of tuples, can be formulated as in (8) to represent a solution to the DAG scheduling problem. The order of the tuples in m represents the priority of each DAG task v_i with the allocated processor p_i, and V = (v_1, v_2, ..., v_|V|) is a topological sequence of the DAG, with the hypothetical entry task (with no predecessors) v_1 and exit task (with no successors) v_|V| respectively representing the beginning and end of execution. Moreover, if tuple A is directly before tuple B and v_A is the predecessor of v_B in the DAG, the second element of tuple B, f_B, will be 1, and vice versa.

m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)). (8)

4.2.2. Fitness Function. Potential energy (PE) is defined as the fitness function value of the corresponding solution (molecule). The overall schedule length of the entire DAG, namely, makespan, is the largest finish time among all tasks, which is equivalent to the actual finish time of the exit node in the DAG. In this paper, the goal of the DAG scheduling problem by DRSCRO is to obtain the scheduling that minimizes makespan and ensures that the precedence of the tasks is not violated. Hence, each fitness function value is defined as

Fit(m) = PE_m = makespan. (9)

Algorithm 1 presents how to calculate the value of the optimization fitness function Fit(m).

4.3. Super Molecule Selection Phase

4.3.1. Initialization. There are two kinds of initial molecule generator, one used in the phase of super molecule selection and the other used in the phase of global optimization, to generate the initial solutions for DRSCRO to manipulate. The tuples of the first molecule m used in the initialization of the phase of super molecule selection are ascendingly ordered by the upward rank value [27] of their v_i, and element three, p_i, of each tuple is generated by a random perturbation. The upward rank value can be calculated by

Rank(v_i) = W(v_i) + max_{v_j ∈ successor(v_i)} {comm(v_i, v_j) + Rank(v_j)}. (10)

A detailed description of the initial molecule generator of the super molecule selection phase is given in Algorithm 2. For the first input molecule m, p_x in each tuple in m is set as p_1.
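For reference, the upward rank of (10) can be computed recursively, as in the sketch below. The average costs W are the row averages of Table 3; the communication costs are assumed, as they are not fully legible in this copy.

```python
# Upward rank (10): Rank(vi) = W(vi) + max over successors (comm + Rank(vj)).
from functools import lru_cache

W = {1: 8.0, 2: 14.0, 3: 15.0, 4: 10.0}   # row averages of Table 3
comm = {(1, 2): 5, (1, 4): 16, (2, 3): 17, (4, 3): 14}  # assumed edge costs
succ = {1: [2, 4], 2: [3], 3: [], 4: [3]}

@lru_cache(maxsize=None)
def rank(v):
    """The exit task's rank is just its average cost; others recurse."""
    if not succ[v]:
        return W[v]
    return W[v] + max(comm[(v, s)] + rank(s) for s in succ[v])

# Sorting tasks by decreasing rank yields the priority sequence
# (v1, v2, v4, v3) used for the first molecule in Section 4.5.
order = sorted(W, key=rank, reverse=True)
```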

4.3.2. Elementary Chemical Reaction Operators. In DRSCRO, the operators for super molecule selection just randomly change p_i of each tuple in a molecule, as the intensification searches or the diversification searches [25], to optimize the processor mapping of a solution. Figures 3, 4, 5, and 6 respectively show examples of the four operators for super


(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3)   for each p_i in molecule m to randomly change
(4)     change p_i randomly
(5)   end for
(6)   generate a new molecule m′
(7)   MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m): generating the initial population for the super molecule selection phase.
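A minimal sketch of Algorithm 2: each of the PopSize initial molecules keeps the task order of m and only randomizes the processor element of every tuple (the processor names are illustrative).

```python
import random

def init_mole_sms(m, pop_size, procs=("p1", "p2", "p3")):
    """InitMoleSMS sketch: derive pop_size molecules from m by randomly
    re-assigning each tuple's processor; the scheduling order is unchanged."""
    return [[(v, f, random.choice(procs)) for v, f, _ in m]
            for _ in range(pop_size)]
```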

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). The white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m′ from a given reaction molecule m for optimization. OnWallSMS works as follows: (1) the operator randomly chooses a tuple (v_i, f_i, p_i) in m; (2) the operator changes p_i randomly. In the end, the operator generates a new molecule m′ from m as an intensification search.

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m′1 and m′2 from a given reaction molecule m. DecompSMS works as follows: (1) the operator generates two molecules m′1 = m and m′2 = m; (2) the operator keeps the tuples in m′1 which are at the odd positions in m, and then changes the remaining p_x's of the tuples in m′1 randomly; (3) the operator retains the tuples in m′2 which are at the even positions in m, and then changes the remaining p_x's of the tuples in m′2 randomly. In the end, the operator generates two new molecules m′1 and m′2 from m as a diversification search.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)   for each tempS in tempSet except SMole
(5)     choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0 randomly
(6)     generate a random number rnd ∈ (0, 1)
(7)     if rnd ≥ 0.5
(8)       find the first predecessor v_j = Pred(v_i) from v_i to the beginning in molecule tempS
(9)       interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10)      update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11)    end if
(12)    for each p_i in molecule tempS to randomly change
(13)      change p_i randomly
(14)    end for
(15)    if Fit(tempS) < Fit(SMole)
(16)      SMole = tempS
(17)    end if
(18)  end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)   pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m′1 and m′2 from given molecules m1 and m2. This operator first uses the steps in OnWallSMS to generate m′1 from m1, and then the operator generates the other new molecule m′2 from m2 in the similar fashion. In the end, the operator generates two new molecules m′1 and m′2 from m1 and m2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m′ from given molecules m1 and m2. SynthSMS works as follows: the operator keeps the tuples in m′ which are at the same positions in m1 and m2 with the same p_x's, and then changes the remaining p_y's in m′ randomly. As a result, the operator generates m′ from m1 and m2 as a diversification search.
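The synthesis step can be sketched as follows, reading the SynthSMS description directly (processor names are illustrative, and both parents are assumed to share the same tuple order, as they do in this phase): tuples on which both parents agree keep their processor, and the rest are randomized.

```python
import random

def synth_sms(m1, m2, procs=("p1", "p2", "p3")):
    """SynthSMS sketch: where the two parent molecules assign the same
    processor at the same position, keep it; otherwise pick randomly."""
    return [(v, f, pa if pa == pb else random.choice(procs))
            for (v, f, pa), (_, _, pb) in zip(m1, m2)]
```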

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4 respectively present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population with the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of elements in pop_subset, which is set as 50% of PopSize in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in the VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of the optimization of both the scheduling order and the processor assignment, the input molecules of VNS can be with different tuple orders (i.e., task priorities), as presented in Algorithm 3, in each iteration. d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which have demonstrated their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing load among the various processors usually helps minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing communication overhead and idle waiting time of processors always results in a more effective schedule, especially given a relatively high unit communication cost. However, there is a contradiction between these two intuitions, because reducing communication overhead and idle waiting time of processors always means that some processors hold most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4)   d = 1
(5)   while d < d_max do
(6)     Randomly generate a molecule m1 from the dth neighborhood of m
(7)     Apply some local search method with m1 as the initial molecule (the local optimum presented by m2)
(8)     if m2 is better than m
(9)       m = m2
(10)      d = 1
(11)    else
(12)      d = d + 1
(13)    end if
(14)  end while
(15)  until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.
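Independently of the scheduling specifics, the control flow of Algorithm 4 is the classical VNS loop, sketched here on a toy integer objective; the neighborhoods and local search below are placeholders, not the paper's load balance or communication reduction structures.

```python
import random

def vns(x0, neighborhoods, local_search, fitness, d_max=2, max_iter=20):
    """VNS loop sketch (cf. Algorithm 4): shake in the d-th neighborhood,
    improve locally, and restart from d = 1 whenever the incumbent improves."""
    x = x0
    for _ in range(max_iter):                 # cf. termination criterion 1
        d = 1
        while d <= d_max:
            shaken = neighborhoods[d - 1](x)  # random point in d-th neighborhood
            local = local_search(shaken)      # local optimum reached from it
            if fitness(local) < fitness(x):
                x, d = local, 1               # improvement: restart neighborhoods
            else:
                d += 1
    return x

# Toy demonstration: minimize (x - 3)^2 over the integers.
f = lambda x: (x - 3) ** 2
moves = [lambda x: x + random.choice([-1, 1]),
         lambda x: x + random.choice([-2, 2])]

def greedy(x):                                # trivial hill-descending search
    while f(x - 1) < f(x):
        x -= 1
    while f(x + 1) < f(x):
        x += 1
    return x
```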

(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_max with the largest Tend(p_max)
(5) Randomly choose a task v_random from exec(p_max)
(6) Randomly choose a processor p_random different from p_max
(7) Reallocate v_random to the processor p_random
(8) Encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).
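A compact sketch of Algorithm 5, assuming the Tend value of each processor has already been computed and the schedule is given as a task-to-processor map (the encoding and rescheduling of ω′ are elided):

```python
import random

def gen_lb_neighborhood(schedule, tend, procs):
    """Algorithm 5 sketch: move one random task off the processor with the
    largest Tend onto another, randomly chosen, processor."""
    p_max = max(tend, key=tend.get)                       # step (4)
    task = random.choice([t for t, p in schedule.items() if p == p_max])
    p_rnd = random.choice([p for p in procs if p != p_max])
    neighbor = dict(schedule)
    neighbor[task] = p_rnd                                # reallocate v_random
    return neighbor
```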

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and TCcomm(p_i) the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) indicate the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of task reducing or increasing): the greater TCload(p_i) is, the stronger the tendency of reducing tasks on p_i is, and the greater TCcomm(p_i) is, the stronger the tendency of increasing tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure the tendency with the combination of TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model is under the comprehensive consideration of both intuitions and can make the VNS algorithm more effective than the original one:

Tend(p_i) = 1 / (1 + e^(−TCload(p_i)/TCcomm(p_i))). (11)
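The processor-selection model can be sketched as below, assuming the argument of the exponential in (11) is the ratio TCload/TCcomm (the separator is garbled in this copy); the cost values are illustrative only.

```python
import math

def tend(tc_load, tc_comm):
    """Tendency parameter of (11): a logistic function of the ratio
    TCload(pi)/TCcomm(pi), so its value always lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-tc_load / tc_comm))

# Algorithm 5 picks the processor with the largest Tend (it should shed a
# task); Algorithm 6 picks the smallest (it should attract a task).
costs = {"p1": (30.0, 10.0), "p2": (8.0, 16.0), "p3": (12.0, 12.0)}
scores = {p: tend(load, com) for p, (load, com) in costs.items()}
p_max = max(scores, key=scores.get)   # used in Algorithm 5
p_min = min(scores, key=scores.get)   # used in Algorithm 6
```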

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum number of iterations without improvement to 3. The VNS algorithm stops if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.
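The combination strategy is a plain merge-and-truncate step; a sketch, with the fitness function passed in explicitly:

```python
def combine(current_pop, vns_output, pop_size, fitness):
    """Merge both molecule sets, sort by increasing makespan, and keep the
    best pop_size molecules as the new initial population."""
    return sorted(current_pop + vns_output, key=fitness)[:pop_size]
```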

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary p_i of each tuple but also interchange the positions of the tuples in a molecule, as the intensification searches or the diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here, to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper, and it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_min with the smallest Tend(p_min)
(5) Set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)   Compute the set predecessor(v_i)
(8)   Update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)   cand = cand + predecessor(v_i)
(10) end for
(11) Randomly choose a task v_random from cand
(12) Reallocate v_random to the processor p_min
(13) Encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).

(1) choose randomly a tuple (v_i, f_i, p_i) in m1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m′1 = GenLBNeighborhood(m1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m′2 = GenLBNeighborhood(m2)

Algorithm 7: IntermoleGO(m1, m2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks   p1   p2   p3
v1       7    8    9
v2      12   14   16
v3      14   15   16
v4      13    3   14

before, to promote the intensification capability of DRSCRO and to avoid function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m′1 and m′2 from given molecules m1 and m2. This operator first uses the steps in OnWallGO to generate m′1 from m1, and then the operator generates the other new molecule m′2 from m2 in the similar fashion. In the end, the operator generates two new molecules m′1 and m′2 from m1 and m2 as an intensification search. The detailed executions are presented in Algorithm 7. Figure 7 shows the example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v1, v2, v4, v3) is found based on the upward rank value of each task in the DAG, and the first molecule, m = ((v1, 0, p1), (v2, 1, p1), (v4, 0, p1), (v3, 1, p1)), can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations in the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v1, 0, p3), (v2, 1, p1), (v4, 0, p2), (v3, 1, p1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations in the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v1, 0, p2), (v4, 1, p2), (v2, 0, p2), (v3, 1, p2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability for solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is approached by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to obtain a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better capability of intensification search than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also serves as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is also utilized in the ineffective reaction operator. Moreover, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures for promoting the efficiency of VNS, different from the VNS proposed in [22]. All three advantages, as previously mentioned, enhance the ability to achieve better rapidity of convergence and better search results in the whole solution space, which is demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiment and comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], by theory analysis and experimental results, TMSCRO and DMSCRO proved to have better performance than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and, for DRSCRO as a metaheuristic algorithm, we focus on the performance of our proposed algorithm itself and on the comparison between DRSCRO and other similar kinds of algorithms.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan test and the convergence rate test, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; row operations are applied systematically to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest task number at the same level is 6. The reason for utilizing these two application graphs as a test bed is not only to enhance the comparability of the various algorithms but also to show the function application of our proposed algorithm as an illustrative demonstration, without loss of generality. The second extensive test bed for the comparative study consists of DAGs of random graphs. A random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiment. It allows the user to generate a variety of random graphs with different characteristics, such as CCR, the amount of calculation of a task, the successor number of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and communication links have the same computation cost and communication cost, respectively.

As shown in Figure 2, the next-phase criteria and stopping criteria of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.


Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speeds of a computing processor are different for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set as the value that makes the heterogeneity level 2, unless otherwise specified in this paper.
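The relation between hl and the heterogeneity level is direct; with hl = 1/3 (the 0.333 of Table 4), the ratio of the best to the worst processor speed is 2:

```python
def heterogeneity_level(hl):
    """Heterogeneity level (1 + hl) / (1 - hl): the biggest possible ratio
    of the best processor speed to the worst processor speed for a task."""
    return (1 + hl) / (1 - hl)

level = heterogeneity_level(1 / 3)   # close to 2, matching hl = 0.333 in Table 4
```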

The details of the parameter settings are shown in Table 4. Parameters 6-12, used by the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiment.

Parameter: Value
CCR: 0.1, 0.2, 1, 2, 5
Number of processors: 4, 8, 16, 32
hl: 0.333
Successor number of a task in a random graph: 1, 2, 3, 4
Total number of tasks in a random graph: 10, 20, 50
InitialKE: 1000
θ: 500
ϑ: 10
Buffer: 200
KELossRate: 0.2
MoleColl: 0.2
PopSize: 10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10-13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5-8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance on searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; this therefore suggests that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better capability of intensification search, by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and the utilization of one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the performance of the average results obtained by DRSCRO is better than that obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of the heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the makespan on average increases with the CCR value. This is because the heterogeneous processors stay idle for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance over a wide scheduling

Mathematical Problems in Engineering 15

Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 120086 | 117624 | 116993 | 116759 | 114365 | 119589 | 2487
8  | 120074 | 116633 | 116007 | 115775 | 113402 | 118582 | 2466
16 | 120031 | 116101 | 115478 | 115247 | 112885 | 118041 | 2455
32 | 119993 | 115476 | 114856 | 114529 | 112182 | 117306 | 2440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 126852 | 124252 | 123585 | 123337 | 122042 | 126327 | 1769
8  | 96952  | 94965  | 94455  | 94266  | 92333  | 96551  | 2008
16 | 82356  | 80668  | 80235  | 80074  | 78433  | 82015  | 1706
32 | 75630  | 74080  | 73682  | 73535  | 72027  | 75317  | 1566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

CCR | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 111782 | 109491 | 108903 | 108685 | 106457 | 111320 | 2315
0.2 | 112034 | 109737 | 109148 | 108930 | 106697 | 111571 | 2320
1   | 120031 | 116101 | 115478 | 115247 | 112885 | 118041 | 2455
2   | 160760 | 157465 | 156619 | 156306 | 153102 | 158532 | 3034
5   | 414581 | 406082 | 403902 | 403095 | 400878 | 404805 | 2125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

CCR | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 96612  | 94632  | 94124  | 93935  | 92010  | 96212  | 2001
0.2 | 96952  | 94965  | 94455  | 94266  | 92333  | 96551  | 2008
1   | 131094 | 128407 | 127717 | 127462 | 124849 | 130552 | 2715
2   | 222635 | 218071 | 216900 | 216467 | 214194 | 219550 | 2456
5   | 403600 | 395327 | 393204 | 392418 | 390260 | 394083 | 2069

scenario range, and that metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of these experimental results are listed in Tables 9–11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO also has better performance than the other three algorithms as the task number increases; the reasons are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases. It can be seen that the average makespan rises rapidly with increasing CCR, because the DAG becomes more communication-intensive as CCR grows, which leaves the processors in the idle state for longer.

5.4. Convergence Tests. In this section, the convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 149400 | 146337 | 145552 | 145261 | 142283 | 148782 | 3094
8  | 119511 | 117061 | 116433 | 116200 | 113818 | 119017 | 2475
16 | 119473 | 117024 | 116396 | 116163 | 113782 | 118979 | 2474
32 | 119468 | 115550 | 114929 | 114700 | 112348 | 117480 | 2443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Tasks | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
10 | 714484  | 706149  | 699100  | 690708  | 688982  | 693638  | 2025
20 | 1100918 | 1089685 | 1080428 | 1069082 | 1066410 | 1071479 | 2619
50 | 1666373 | 1662543 | 1649988 | 1634230 | 1633415 | 1636261 | 1165

Table 11: Experiment results of DRSCRO for the random graph under different CCRs; the task number is 50.

CCR | 4 processors | 8 processors | 16 processors | 32 processors
0.1 | 145093 | 116068 | 116058 | 114592
0.2 | 145261 | 116200 | 116163 | 114700
1   | 153248 | 120228 | 119511 | 118534
2   | 194401 | 166014 | 164246 | 162292
5   | 508759 | 498427 | 492049 | 487830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of the three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.

Figure 15: Average makespan for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms differ clearly, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better rate of convergence of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graph under different CCRs; the task number is 50.

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16.

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also lets it obtain a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, the statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method        | Algorithms              | p value  | Hypothesis
Friedman test | DRSCRO, DMSCRO          | 2.53E−02 | Rejected
Friedman test | DRSCRO, TMSCRO          | 2.53E−02 | Rejected
Friedman test | DRSCRO, DMSCRO, TMSCRO  | 6.70E−03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method     | Algorithms              | p value  | Hypothesis
Quade test | DRSCRO, DMSCRO          | 1.32E−02 | Rejected
Quade test | DRSCRO, TMSCRO          | 1.32E−02 | Rejected
Quade test | DRSCRO, DMSCRO, TMSCRO  | 1.09E−03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
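The Friedman procedure used here can be sketched in a few lines of pure Python. The makespan samples below are illustrative stand-ins, not the paper's measured data, and the critical value is the chi-square quantile for df = 2 at α = 0.05.

```python
# Minimal sketch of the Friedman test (no tie handling; ILLUSTRATIVE data).
def friedman_statistic(samples):
    """samples: list of k algorithms, each a list of n per-instance scores."""
    k, n = len(samples), len(samples[0])
    rank_sums = [0.0] * k
    for i in range(n):
        # Rank the k algorithms within instance i (rank 1 = lowest makespan).
        column = sorted(range(k), key=lambda j: samples[j][i])
        for rank, j in enumerate(column, start=1):
            rank_sums[j] += rank
    mean_ranks = [r / n for r in rank_sums]
    chi2 = (12 * n / (k * (k + 1))) * sum(r * r for r in mean_ranks) - 3 * n * (k + 1)
    return chi2, mean_ranks

drscro = [1147.0, 1152.5, 1162.9, 1566.2]
tmscro = [1148.6, 1154.8, 1163.9, 1574.7]
dmscro = [1154.8, 1161.0, 1170.2, 1586.3]

chi2, ranks = friedman_statistic([drscro, tmscro, dmscro])
critical = 5.991  # chi-square critical value, df = k - 1 = 2, alpha = 0.05
print(chi2 > critical)  # True -> reject the equal-performance hypothesis
```

With a consistent ranking over four instances the statistic is 8.0, above the critical value, so the null hypothesis of equivalent performance is rejected, mirroring the outcome in Tables 12 and 13.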

Tables 12 and 13, respectively, list the results of the Friedman test and the Quade test, both of which reject the null hypothesis of equivalent performance. In both tests, our proposed DRSCRO is not only compared against all the algorithms together but also compared against each remaining one as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, which is the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms on convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of other metaheuristic algorithms of the same kind because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions when averaged over all possible fitness functions. However, as the convergence tests show, the proposed DRSCRO can find good solutions faster than the other algorithms of this kind. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, takes advantage of the VNS algorithm in the global optimization phase to improve its optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for better intensification capability, and the new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed for DAG scheduling on heterogeneous systems in this paper. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and that DRSCRO can also obtain a better average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to improve its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to target two objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.


Figure 2: Framework of DRSCRO.


(1) makespan = 0
(2) for each node v_k in m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)) do
(3)     calculate the actual finish time of v_k (i.e., T_AFTime(v_k))
(4)     if makespan < T_AFTime(v_k)
(5)         update makespan:
(6)         makespan = T_AFTime(v_k)
(7)     end if
(8) end for
(9) return makespan

Algorithm 1: Fit(m): calculating the fitness value of a molecule and the processor allocation optimization.
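Algorithm 1 can be read as a single pass over the molecule's tuples in scheduling order, keeping the largest actual finish time. The sketch below is one plausible rendering in Python; the actual-finish-time model (processor ready time plus cross-processor communication cost) and the tiny example DAG are illustrative assumptions, not the paper's exact cost model.

```python
# Hedged sketch of Fit(m): tasks are visited in the molecule's order, which
# is assumed to be a valid topological order of the DAG.
def fit(molecule, exec_time, preds):
    """molecule: ordered list of (task, processor); exec_time[task][proc];
    preds[task]: {predecessor: communication_cost}."""
    aft = {}            # actual finish time per task
    proc_ready = {}     # earliest free time per processor
    placed = {}         # processor chosen for each scheduled task
    makespan = 0.0
    for task, proc in molecule:
        # Data from a predecessor on another processor pays the edge cost.
        data_ready = max(
            (aft[p] + (c if placed[p] != proc else 0.0)
             for p, c in preds[task].items()),
            default=0.0,
        )
        start = max(proc_ready.get(proc, 0.0), data_ready)
        aft[task] = start + exec_time[task][proc]
        proc_ready[proc] = aft[task]
        placed[task] = proc
        makespan = max(makespan, aft[task])
    return makespan

exec_time = {"v1": {0: 2, 1: 3}, "v2": {0: 4, 1: 2}, "v3": {0: 3, 1: 3}}
preds = {"v1": {}, "v2": {"v1": 1}, "v3": {"v1": 1, "v2": 2}}
print(fit([("v1", 0), ("v2", 1), ("v3", 0)], exec_time, preds))  # -> 10.0
```

The fitness value equals the finish time of the last task to complete, i.e., the makespan minimized by DRSCRO.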

with a new model for processor selection is adopted as the initialization of the global optimization phase, and it is also utilized as a local search process to promote the intensification capability of DRSCRO. There are four kinds of elementary chemical reactions in DRSCRO: on-wall collision, decomposition, intermolecular collision, and synthesis. Each kind of reaction contains two types of operators, which are respectively utilized in the two phases of DRSCRO. In each iteration, one of the elementary chemical reaction operators is performed to generate new molecules, and the PEs of the newly generated molecules (i.e., their fitness function values) are calculated. In addition, SMole is tracked and only participates in on-wall ineffective collisions and intermolecular ineffective collisions in the global optimization phase, to explore as much as possible of the solution space in its neighborhoods; the main purpose is to prevent the super molecule from changing dramatically. The iteration of each phase repeats until the stopping criteria (or next-phase criteria) are met, and SMole and its fitness function value are then the final solution and makespan (i.e., the global min point), respectively. In the implementations of the experiments in this paper, the next-phase criteria and the stopping criteria of DRSCRO are set as the point when there is no makespan improvement after 10,000 consecutive iterations in the search loop.
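The stopping rule described above (no makespan improvement for 10,000 consecutive iterations) amounts to a stall counter wrapped around the reaction loop. A minimal sketch, with a hypothetical `propose` step standing in for the elementary reaction operators:

```python
# Sketch of the phase loop's stopping rule: stop when no fitness improvement
# has been seen for `stall_limit` consecutive iterations. `propose` is an
# assumption here, standing in for one elementary chemical reaction.
def optimise(initial, fitness, propose, stall_limit=10000):
    best, best_fit, stall = initial, fitness(initial), 0
    while stall < stall_limit:
        candidate = propose(best)
        cand_fit = fitness(candidate)
        if cand_fit < best_fit:
            best, best_fit, stall = candidate, cand_fit, 0  # reset on improvement
        else:
            stall += 1
    return best, best_fit

# Toy usage: minimise x^2 with random nudges (a stand-in for makespan).
import random
rng = random.Random(1)
sol, val = optimise(10.0, lambda x: x * x,
                    lambda x: x + rng.uniform(-1, 1), stall_limit=500)
print(val < 100.0)  # improvement over the initial fitness of 100.0
```

In DRSCRO the same structure is run twice: once with the super molecule selection operators and once with the global optimization operators.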

4.2. Molecular Structure and Fitness Function. This subsection presents the encoding of scheduling solutions (i.e., the molecular structure) and the statement of the fitness function in DRSCRO.

4.2.1. Molecular Structure. In this paper, an atom with three elements can be denoted as a tuple (v_i, f_i, p_i), and the molecular structure m, an array of tuples, can be formulated as in (8) to represent a solution to the DAG scheduling problem. The order of the tuples in m represents the priority of each DAG task v_i with the allocated processor p_i, and V = (v_1, v_2, ..., v_|V|) is a topological sequence of the DAG, with the hypothetical entry task (with no predecessors) v_1 and exit task (with no successors) v_|V| respectively representing the beginning and end of execution. Moreover, if tuple A is before tuple B and v_A is the predecessor of v_B in the DAG, the second integer of tuple B, f_B, will be 1, and vice versa:

m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)). (8)

4.2.2. Fitness Function. Potential energy (PE) is defined as the fitness function value of the corresponding solution m. The overall schedule length of the entire DAG, namely the makespan, is the largest finish time among all tasks, which is equivalent to the actual finish time of the exit node in the DAG. In this paper, the goal of the DAG scheduling problem solved by DRSCRO is to obtain a scheduling that minimizes the makespan while ensuring that the precedence constraints of the tasks are not violated. Hence, the fitness function value is defined as

Fit(m) = PE_m = makespan. (9)

Algorithm 1 presents how to calculate the value of the optimization fitness function Fit(m).

4.3. Super Molecule Selection Phase

4.3.1. Initialization. There are two kinds of initial molecule generators, one used in the phase of super molecule selection and the other used in the phase of global optimization, to generate the initial solutions for DRSCRO to manipulate. The tuples of the first molecule m used in the initialization of the super molecule selection phase are ordered ascendingly by the upward rank value [27] of their v_i, and element three, p_i, of each tuple is generated by a random perturbation. The upward rank value can be calculated by

Rank(v_i) = W(v_i) + max_{v_j ∈ successor(v_i)} {comm(v_i, v_j) + Rank(v_j)}. (10)

A detailed description of the initial molecule generator of the super molecule selection phase is given in Algorithm 2. For the first input molecule m, p_x in each tuple in m is set as p_1.
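Equation (10) is a bottom-up recursion over the DAG, anchored at the exit task (whose rank is just its computation cost). A sketch with memoisation, using illustrative costs W and comm:

```python
# Sketch of the upward-rank priority in Eq. (10). The small DAG and the
# cost values below are illustrative assumptions.
from functools import lru_cache

W = {"v1": 3.0, "v2": 4.0, "v3": 2.0, "v4": 5.0}                  # computation costs
comm = {("v1", "v2"): 2.0, ("v1", "v3"): 1.0,
        ("v2", "v4"): 3.0, ("v3", "v4"): 2.0}                     # edge costs
succ = {"v1": ["v2", "v3"], "v2": ["v4"], "v3": ["v4"], "v4": []}

@lru_cache(maxsize=None)
def rank(v):
    # Exit task: no successors, so the max defaults to 0 and rank = W(v).
    tail = max((comm[(v, s)] + rank(s) for s in succ[v]), default=0.0)
    return W[v] + tail

# Tasks are prioritised in decreasing upward-rank order.
print(sorted(W, key=rank, reverse=True))  # -> ['v1', 'v2', 'v3', 'v4']
```

Here rank(v1) = 3 + max(2 + 12, 1 + 9) = 17, so the entry task gets the highest priority, as expected for an upward rank.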

4.3.2. Elementary Chemical Reaction Operators. In DRSCRO, the operators for super molecule selection just randomly change the p_i of each tuple in a molecule, as intensification searches or diversification searches [25], to optimize the processor mapping of a solution. Figures 3, 4, 5, and 6 respectively show examples of the four operators for super


(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3)     for each p_i in molecule m to randomly change
(4)         change p_i randomly
(5)     end for
(6)     generate a new molecule m′
(7)     MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m): generating the initial population for the super molecule selection phase.
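Algorithm 2 amounts to cloning a seed molecule PopSize times while re-drawing every processor field at random. A minimal sketch; the seed molecule, processor names, and fixed seed are illustrative assumptions:

```python
# Hedged sketch of Algorithm 2 (InitMoleSMS): the task order and flags are
# kept, only the processor of each tuple is re-drawn at random.
import random

def init_mole_sms(seed_molecule, processors, pop_size, rng=random.Random(7)):
    population = []
    while len(population) < pop_size:
        population.append(
            [(v, f, rng.choice(processors)) for v, f, _ in seed_molecule])
    return population

seed = [("v1", 0, "p1"), ("v2", 1, "p1"), ("v3", 0, "p1")]
pop = init_mole_sms(seed, ["p1", "p2", "p3"], pop_size=4)
print(len(pop))  # -> 4
```

Because only processor assignments vary, this generator explores exactly the dimension that the super molecule selection phase optimizes.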

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). The white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m′ from a given reaction molecule m for optimization. OnWallSMS works as follows: (1) the operator randomly chooses a tuple (v_i, f_i, p_i) in m; (2) the operator changes p_i randomly. In the end, the operator

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

generates a new molecule m′ from m as an intensification search.

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m′_1 and m′_2 from a given reaction molecule m. DecompSMS works as follows: (1) the operator generates two molecules, m′_1 = m and m′_2 = m; (2) the operator keeps the tuples in m′_1 that are at the odd positions in m and then changes the remaining p_x's of the tuples in m′_1 randomly; (3) the operator retains the tuples in m′_2 that are at the even positions in m and then changes the remaining p_x's of the tuples in m′_2 randomly. In the end, the operator generates two new molecules, m′_1 and m′_2, from m as a diversification search.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)     for each tempS in tempSet except SMole
(5)         choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0 randomly
(6)         generate a random number rnd ∈ (0, 1)
(7)         if rnd ≥ 0.5
(8)             find the first predecessor v_j = Pred(v_i) from v_i to the beginning in molecule tempS
(9)             interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10)            update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11)        end if
(12)        for each p_i in molecule tempS to randomly change
(13)            change p_i randomly
(14)        end for
(15)        if Fit(tempS) < Fit(SMole)
(16)            SMole = tempS
(17)        end if
(18)    end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)    pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m1′ and m2′ from given molecules m1 and m2. This operator first uses the steps in OnWallSMS to generate m1′ from m1, and then the operator generates the other new molecule m2′ from m2 in a similar fashion. In the end, the operator generates two new molecules m1′ and m2′ from m1 and m2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m′ from given molecules m1 and m2. SynthSMS works as follows: the operator keeps the tuples in m′ which are at the same positions in m1 and m2 with the same p_x's and then changes the remaining p_y's in m′ randomly. As a result, the operator generates m′ from m1 and m2 as a diversification search.

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4, respectively, present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population with the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of the population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of elements in pop_subset, which is set as PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in the VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of optimizing both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3. In each iteration, d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which have demonstrated their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing the load among the various processors usually helps to minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing the communication overhead and idle waiting time of processors always results in a more effective schedule, especially given a relatively high unit communication cost. However, there is a contradiction between these two intuitions, because reducing the communication overhead and idle waiting time of processors usually means that some processors are assigned most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4)     d = 1
(5)     while d < d_max do
(6)         Randomly generate a molecule m1 from the dth neighborhood of m
(7)         Apply some local search method with m1 as the initial molecule (the local optimum is presented by m2)
(8)         if m2 is better than m
(9)             m = m2
(10)            d = 1
(11)        else
(12)            d = d + 1
(13)        end if
(14)    end while
(15) until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.
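Stripped of the CRO bookkeeping, Algorithm 4 is a standard VNS loop. The following generic sketch uses placeholder neighborhood, local-search, and fitness functions of our own invention, and iterates d up to d_max inclusive (a common variant of the loop condition):

```python
import random

def vns(molecule, neighborhoods, local_search, fitness, d_max=2):
    """Basic VNS skeleton: shake in the d-th neighborhood, run a local
    search, restart from d = 1 on improvement, otherwise widen d."""
    best = molecule
    d = 1
    while d <= d_max:
        m1 = neighborhoods[d - 1](best)   # random molecule in d-th neighborhood
        m2 = local_search(m1)             # local optimum reached from m1
        if fitness(m2) < fitness(best):   # smaller makespan is better
            best, d = m2, 1
        else:
            d += 1
    return best

# Toy usage: "molecules" are plain numbers and fitness is the absolute value.
rng = random.Random(7)
nbhs = [lambda x: x - rng.random(), lambda x: x + rng.random()]
result = vns(5.0, nbhs, lambda x: x * 0.5, abs, d_max=2)
```

Since `best` is only replaced on strict improvement, the returned fitness can never exceed that of the starting molecule.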

(1) for each processor p_i in the solution ω do
(2)     Compute Tend(p_i)
(3) end for
(4) Choose the processor p_max with the largest Tend(p_max)
(5) Randomly choose a task v_random from exec(p_max)
(6) Randomly choose a processor p_random different from p_max
(7) Reallocate v_random to the processor p_random
(8) Encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) indicate the tendencies toward load balancing and communication reducing, respectively (i.e., the tendency of reducing or increasing the tasks on a processor). The greater TCload(p_i) is, the stronger the tendency to reduce the tasks on p_i is; the greater TCcomm(p_i) is, the stronger the tendency to increase the tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure the tendency with the combination of TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model takes both intuitions into comprehensive consideration and can make the VNS algorithm more effective than the original one.

Tend(p_i) = 1 / (1 + e^(−TCload(p_i)/TCcomm(p_i))).  (11)
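Equation (11) is a logistic function of the ratio TCload(p_i)/TCcomm(p_i). A small sketch of how Algorithms 5 and 6 could use it for processor selection follows; the per-processor cost numbers are invented for illustration and are not taken from the paper's experiments.

```python
import math

def tend(tc_load, tc_comm):
    """Eq. (11): logistic combination of the load and communication
    tendencies; larger values favor removing tasks from the processor."""
    return 1.0 / (1.0 + math.exp(-(tc_load / tc_comm)))

# Illustrative (load, communication) costs per processor
costs = {"p1": (30.0, 5.0), "p2": (12.0, 9.0), "p3": (4.0, 16.0)}
tends = {p: tend(load, comm) for p, (load, comm) in costs.items()}

p_max = max(tends, key=tends.get)   # Algorithm 5 moves a task off p_max
p_min = min(tends, key=tends.get)   # Algorithm 6 moves a task onto p_min
```

A heavily loaded processor with little communication (p1 above) gets a Tend value near 1 and is the candidate for unloading, while a lightly loaded, communication-heavy processor (p3) gets the smallest value and is the candidate for receiving a task.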

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum number of iterations without improvement to 3. The VNS algorithm stops if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.
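The combination strategy described above can be sketched as a merge-sort-truncate step; here molecules are abbreviated to their makespan values for brevity, which is an illustration rather than the paper's data structure.

```python
def combine(current_pop, vns_output, pop_size, makespan):
    """Merge the current population with the VNS output, sort by
    increasing makespan, and keep the best pop_size molecules."""
    merged = current_pop + vns_output
    merged.sort(key=makespan)          # smallest makespan first
    return merged[:pop_size]

# Molecules abbreviated to integers whose value is their makespan
new_pop = combine([42, 57, 38], [35, 61], 3, makespan=lambda m: m)
# new_pop == [35, 38, 42]
```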

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary the p_i of each tuple but also interchange the positions of the tuples in a molecule, as intensification searches or diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here in order to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper; it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)     Compute Tend(p_i)
(3) end for
(4) Choose the processor p_min with the smallest Tend(p_min)
(5) Set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)     Compute the set predecessor(v_i)
(8)     Update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)     cand = cand + predecessor(v_i)
(10) end for
(11) Randomly choose a task v_random from cand
(12) Reallocate v_random to the processor p_min
(13) Encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).

(1) choose randomly a tuple (v_i, f_i, p_i) in m1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m1′ = GenLBNeighborhood(m1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m2′ = GenLBNeighborhood(m2)

Algorithm 7: IntermoleGO(m1, m2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks | p1 | p2 | p3
v1    |  7 |  8 |  9
v2    | 12 | 14 | 16
v3    | 14 | 15 | 16
v4    | 13 |  3 | 14

before, to promote the intensification capability of DRSCRO and avoid the function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m1′ and m2′ from given molecules m1 and m2. This operator first uses the steps in OnWallGO to generate m1′ from m1, and then the operator generates the other new molecule m2′ from m2 in a similar fashion. In the end, the operator generates two new molecules m1′ and m2′ from m1 and m2 as an intensification search. The detailed executions are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v1, v2, v4, v3) is found based on the upward rank value of each task in the DAG, and the first molecule m = ((v1, 0, p1), (v2, 1, p1), (v4, 0, p1), (v3, 1, p1)) can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations in the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v1, 0, p3), (v2, 1, p1), (v4, 0, p2), (v3, 1, p1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations in the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v1, 0, p2), (v4, 1, p2), (v2, 0, p2), (v3, 1, p2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which have been proposed very recently, have demonstrated their capability for solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with the other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is approached by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule, because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT, applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better intensification search capability than the other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also serves as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is also utilized in the ineffective reaction operator. Third, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures for promoting the efficiency of VNS, different from the VNS proposed in [22]. All three advantages, as previously mentioned, enhance the ability to obtain better rapidity of convergence and better search results in the whole solution space, which is demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and the comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], by theoretical analysis and experimental results, TMSCRO and DMSCRO have been proved to perform better than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and, as DRSCRO is a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and on the comparison between DRSCRO and other algorithms of a similar kind.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and the convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; it systematically applies row operations to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest task number at the same level is 6. The reason for utilizing these two application graphs as a test bed is not only to enhance the comparability of the various algorithms but also to show the practical application of our proposed algorithm as an illustrative demonstration, without loss of generality. The second extensive test bed for the comparative study is a set of randomly generated DAGs. A random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as CCR, the amount of computation of a task, the successor number of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and communication links have the same computation cost and communication cost, respectively.
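A random layered-DAG generator in the spirit of the one in [39] can be sketched as follows; the parameter names are ours, and the actual generator exposes more knobs (e.g., CCR and per-task computation amounts).

```python
import random

def random_dag(num_tasks, max_successors, rng=None):
    """Generate a random DAG as an adjacency map. Each task may only
    point to higher-numbered tasks, which guarantees acyclicity."""
    rng = rng or random.Random()
    edges = {t: set() for t in range(num_tasks)}
    for t in range(num_tasks - 1):
        candidates = list(range(t + 1, num_tasks))
        k = min(rng.randint(1, max_successors), len(candidates))
        edges[t].update(rng.sample(candidates, k))
    return edges

dag = random_dag(10, 3, random.Random(1))
acyclic = all(u < v for u, succs in dag.items() for v in succs)
```

Ordering the tasks and only allowing forward edges is the standard trick that makes acyclicity hold by construction, so no cycle check is needed afterwards.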

As shown in Figure 2, the next-phase criterion and stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.


Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speeds of a computing processor are different for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set to the value that makes the heterogeneity level 2, unless otherwise specified in this paper.
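Solving (1 + hl)/(1 − hl) = 2 for hl gives hl = 1/3 ≈ 0.333, which is the value listed in Table 4; a one-line check:

```python
def hl_for_level(level):
    """Solve (1 + hl) / (1 - hl) = level for hl."""
    return (level - 1.0) / (level + 1.0)

hl = hl_for_level(2.0)   # 1/3, i.e., about 0.333
```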

The details of the parameter settings are shown in Table 4. Parameters 6–12, which concern the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with that of two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as it is a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiments.

Parameter                                    | Value
CCR                                          | 0.1, 0.2, 1, 2, 5
Number of processors                         | 4, 8, 16, 32
hl                                           | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph      | 10, 20, 50
InitialKE                                    | 1000
θ                                            | 500
ϑ                                            | 10
Buffer                                       | 200
KELossRate                                   | 0.2
MoleColl                                     | 0.2
PopSize                                      | 10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5–8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance on searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; this therefore suggests that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification search capability, owing to the application of VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and the utilization of one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the average results obtained by DRSCRO are better than those obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of the heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the average makespan increases with the CCR value, because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance in a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4  | 120086 | 117624 | 116993 | 116759 | 114365 | 119589 | 2487
8  | 120074 | 116633 | 116007 | 115775 | 113402 | 118582 | 2466
16 | 120031 | 116101 | 115478 | 115247 | 112885 | 118041 | 2455
32 | 119993 | 115476 | 114856 | 114529 | 112182 | 117306 | 2440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4  | 126852 | 124252 | 123585 | 123337 | 122042 | 126327 | 1769
8  |  96952 |  94965 |  94455 |  94266 |  92333 |  96551 | 2008
16 |  82356 |  80668 |  80235 |  80074 |  78433 |  82015 | 1706
32 |  75630 |  74080 |  73682 |  73535 |  72027 |  75317 | 1566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 111782 | 109491 | 108903 | 108685 | 106457 | 111320 | 2315
0.2 | 112034 | 109737 | 109148 | 108930 | 106697 | 111571 | 2320
1   | 120031 | 116101 | 115478 | 115247 | 112885 | 118041 | 2455
2   | 160760 | 157465 | 156619 | 156306 | 153102 | 158532 | 3034
5   | 414581 | 406082 | 403902 | 403095 | 400878 | 404805 | 2125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 |  96612 |  94632 |  94124 |  93935 |  92010 |  96212 | 2001
0.2 |  96952 |  94965 |  94455 |  94266 |  92333 |  96551 | 2008
1   | 131094 | 128407 | 127717 | 127462 | 124849 | 130552 | 2715
2   | 222635 | 218071 | 216900 | 216467 | 214194 | 219550 | 2456
5   | 403600 | 395327 | 393204 | 392418 | 390260 | 394083 | 2069

scenario range, and the metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of these experimental results are listed in Tables 9–11.

Figure 14 shows the experimental results of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO also has better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases. It can be seen that the average makespan increases rapidly with the value of CCR, because the DAG becomes more communication-intensive as CCR grows, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4  | 149400 | 146337 | 145552 | 145261 | 142283 | 148782 | 3094
8  | 119511 | 117061 | 116433 | 116200 | 113818 | 119017 | 2475
16 | 119473 | 117024 | 116396 | 116163 | 113782 | 118979 | 2474
32 | 119468 | 115550 | 114929 | 114700 | 112348 | 117480 | 2443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Number of tasks | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
10 |  714484 |  706149 |  699100 |  690708 |  688982 |  693638 | 2025
20 | 1100918 | 1089685 | 1080428 | 1069082 | 1066410 | 1071479 | 2619
50 | 1666373 | 1662543 | 1649988 | 1634230 | 1633415 | 1636261 | 1165

Table 11: Experiment results of DRSCRO for random graphs under different CCRs; the task number is 50.

Value of CCR | 4 processors | 8 processors | 16 processors | 32 processors
0.1 | 145093 | 116068 | 116058 | 114592
0.2 | 145261 | 116200 | 116163 | 114700
1   | 153248 | 120228 | 119511 | 118534
2   | 194401 | 166014 | 164246 | 162292
5   | 508759 | 498427 | 492049 | 487830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of these three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.

Figure 15: Average makespan for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, it can be observed that the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for random graphs under different CCRs; the task number is 50.

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16.

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost in each iteration, the enhanced optimization capability of DRSCRO also allows it to obtain a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation experimental results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, the statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method        | Algorithms              | p value  | Hypothesis
Friedman test | DRSCRO, DMSCRO          | 2.53E−02 | Rejected
Friedman test | DRSCRO, TMSCRO          | 2.53E−02 | Rejected
Friedman test | DRSCRO, DMSCRO, TMSCRO  | 6.70E−03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method     | Algorithms              | p value  | Hypothesis
Quade test | DRSCRO, DMSCRO          | 1.32E−02 | Rejected
Quade test | DRSCRO, TMSCRO          | 1.32E−02 | Rejected
Quade test | DRSCRO, DMSCRO, TMSCRO  | 1.09E−03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Nonparametric tests are specifically considered, following the recommendations in [40], since the experimental results may present neither a normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level of α = 0.05 is used in all statistical tests.
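The Friedman statistic itself needs no external packages; the following standard-library sketch runs it on made-up makespan data (three algorithms over four problem instances — the numbers are illustrative, not the paper's measurements).

```python
def friedman_statistic(samples):
    """Friedman chi-square statistic for k related samples (treatments),
    each of length n (blocks/problem instances). Ties are ignored for
    simplicity; a p value would come from the chi-square distribution
    with k - 1 degrees of freedom."""
    k, n = len(samples), len(samples[0])
    rank_sums = [0.0] * k
    for block in zip(*samples):                    # one problem instance
        order = sorted(range(k), key=lambda j: block[j])
        for rank, j in enumerate(order, start=1):  # rank 1 = smallest makespan
            rank_sums[j] += rank
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)

# One algorithm best on every instance, one worst (illustrative data)
drscro = [100, 105, 98, 110]
tmscro = [102, 107, 99, 112]
dmscro = [104, 109, 101, 115]
stat = friedman_statistic([drscro, tmscro, dmscro])
```

With a perfectly consistent ranking over n = 4 blocks and k = 3 treatments, the statistic reaches its maximum value 2n = 8.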

Tables 12 and 13, respectively, list the test results of the Friedman test and the Quade test, which both reject the null hypothesis of equivalent performance. In both of these tests, our proposed DRSCRO is not only compared against all the algorithms together but also compared pairwise against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms on coverage rate at a significance level of 0.05.
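As a sketch of how such a nonparametric comparison works, the following example computes the Friedman statistic by hand for hypothetical makespan results (the data, the function name, and the tabulated critical value for k = 3 algorithms at α = 0.05 are illustrative assumptions, not the paper's measurements; ties are not handled):

```python
def friedman_statistic(scores):
    """Friedman chi-square statistic for n problem instances (rows)
    and k algorithms (columns); lower score = better rank."""
    n, k = len(scores), len(scores[0])
    avg_rank = [0.0] * k
    for row in scores:
        # rank the algorithms within this instance (1 = best)
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            avg_rank[j] += rank / n
    return 12.0 * n / (k * (k + 1)) * (
        sum(r * r for r in avg_rank) - k * (k + 1) ** 2 / 4.0)

# Hypothetical makespans of three algorithms on four random DAGs.
makespans = [[10, 12, 14], [9, 11, 13], [20, 22, 25], [15, 18, 19]]
stat = friedman_statistic(makespans)
# Chi-square critical value with k - 1 = 2 d.o.f. at alpha = 0.05.
reject = stat > 5.991
```

Rejecting the null hypothesis here means the three algorithms do not perform equivalently, which is what Tables 12 and 13 report for the real data.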

6 Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other similar metaheuristic algorithms, because, when averaged over all possible fitness functions, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions, according to the No-Free-Lunch Theorem [34]. However, as the experimental results of the convergence tests show, the proposed DRSCRO can achieve better performance and find good solutions faster than the other similar metaheuristic algorithms. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO obtains a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, DRSCRO takes advantage of

the VNS algorithm in the global optimization phase to improve the optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability. The new processor selection model utilized in the neighborhood structures also promotes the efficiency of the VNS algorithm.

7 Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed for DAG scheduling on heterogeneous systems in this paper. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. The super molecule selection phase is used to obtain a super molecule by a metaheuristic method for a better convergence rate, different from other CRO-based algorithms for DAG scheduling on heterogeneous systems. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO can achieve a higher speedup than the other CRO-based algorithms known to us, and DRSCRO can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to aim at two main objectives: (1) minimization of schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "Genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.


[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.


(1) makespan = 0
(2) for each node v_k in m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)) do
(3)     calculate the actual finish time of v_k (i.e., T_AFTime(v_k))
(4)     if makespan < T_AFTime(v_k)
(5)         update makespan:
(6)         makespan = T_AFTime(v_k)
(7)     end if
(8) end for
(9) return makespan

Algorithm 1: Fit(m): calculating the fitness value of a molecule and the processor allocation optimization.

with a new model for processor selection is adopted as the initialization of the global optimization phase, and it is also utilized as a local search process to promote the intensification capability of DRSCRO. There are four kinds of elementary chemical reactions in DRSCRO: on-wall ineffective collision, decomposition, intermolecular ineffective collision, and synthesis. Each kind of reaction contains two types of operators, which are respectively utilized in the two phases of DRSCRO. In each iteration, one of the elementary chemical reaction operators is performed to generate new molecules, and the PEs of the newly generated molecules (i.e., their fitness function values) are calculated. In addition, SMole is tracked and only participates in on-wall ineffective collisions and intermolecular ineffective collisions in the global optimization phase, to explore the solution space in its neighborhoods as much as possible; the main purpose is to prevent the super molecule from changing dramatically. The iteration of each phase repeats until the stopping criteria (or next-phase criteria) are met, and SMole and its fitness function value are then the final solution and makespan (i.e., the global minimum), respectively. In the implementations of the experiments in this paper, the next-phase criteria and the stop criteria of DRSCRO are set as the point when there is no makespan improvement after 10000 consecutive iterations in the search loop.
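The per-phase search loop described above can be sketched as follows. This is a simplified illustration on an abstract molecule type: the single toy operator, the toy fitness function, and the shortened stall threshold are stand-ins for the paper's actual reaction operators, Fit(m), and the 10000-iteration criterion.

```python
import random

def run_phase(pop, operators, fitness, rng, max_stall=200):
    """One DRSCRO phase: repeatedly apply a randomly chosen elementary
    reaction operator, keep improving molecules, and track the best
    molecule found (SMole).  Stops after max_stall iterations without
    improvement of the best fitness (a scaled-down stop criterion)."""
    smole = min(pop, key=fitness)  # the tracked super molecule
    stall = 0
    while stall < max_stall:
        op = rng.choice(operators)
        i = rng.randrange(len(pop))
        candidate = op(pop[i], rng)
        if fitness(candidate) < fitness(pop[i]):
            pop[i] = candidate          # accept the improving reaction
        if fitness(pop[i]) < fitness(smole):
            smole, stall = pop[i], 0    # new super molecule found
        else:
            stall += 1
    return smole

# Toy demonstration: "molecules" are integers, fitness is distance to 7.
rng = random.Random(0)
best = run_phase([0, 20], [lambda m, r: m + r.choice([-1, 1])],
                 lambda m: abs(m - 7), rng)
```

In DRSCRO the same skeleton is run twice, first with the super-molecule-selection operators and then with the global-optimization operators, with the first phase's SMole carried into the second.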

4.2. Molecular Structure and Fitness Function. This subsection presents the encoding of scheduling solutions (i.e., the molecular structure) and the statement of the fitness function in DRSCRO.

4.2.1. Molecular Structure. In this paper, an atom with three elements can be denoted as a tuple (v_i, f_i, p_i), and the molecular structure M, an array of tuples, can be formulated as in (8) to represent a solution to the DAG scheduling problem. The order of the tuples in M represents the priority of each DAG task v_i with its allocated processor p_i, and V = (v_1, v_2, ..., v_|V|) is a topological sequence of the DAG, with the hypothetical entry task (with no predecessors) v_1 and exit task (with no successors) v_|V| respectively representing the beginning and end of execution. Moreover, if tuple A is before tuple B and v_A is the predecessor of v_B in the DAG, the second element of tuple B, f_B, will be 1, and vice versa.

m = ((v_1, f_1, p_1), (v_2, f_2, p_2), ..., (v_|V|, f_|V|, p_|V|)). (8)

4.2.2. Fitness Function. Potential energy (PE) is defined as the fitness function value of the corresponding solution (i.e., the molecule m). The overall schedule length of the entire DAG, namely, the makespan, is the largest finish time among all tasks, which is equivalent to the actual finish time of the exit node in the DAG. In this paper, the goal of the DAG scheduling problem solved by DRSCRO is to obtain a scheduling that minimizes the makespan while ensuring that the precedence constraints of the tasks are not violated. Hence, the fitness function value is defined as

Fit(m) = PE_m = makespan. (9)

Algorithm 1 presents how to calculate the value of the optimization fitness function Fit(m).

4.3. Super Molecule Selection Phase

4.3.1. Initialization. There are two kinds of initial molecule generators, one used in the phase of super molecule selection and the other used in the phase of global optimization, to generate the initial solutions for DRSCRO to manipulate. The tuples of the first molecule m used in the initialization of the super molecule selection phase are ascendingly ordered by the upward rank value [27] of their v_i, and element three, p_i, of each tuple is generated by a random perturbation. The upward rank value can be calculated by

Rank(v_i) = W(v_i) + max_{v_j ∈ successor(v_i)} {comm(v_i, v_j) + Rank(v_j)}. (10)

A detailed description of the initial molecule generator of the super molecule selection phase is given in Algorithm 2. For the first input molecule m, p_x in each tuple in m is set as p_1.

4.3.2. Elementary Chemical Reaction Operators. In DRSCRO, the operators for super molecule selection just randomly change the p_i of each tuple in a molecule, as intensification searches or diversification searches [25], to optimize the processor mapping of a solution. Figures 3, 4, 5, and 6 respectively show examples of the four operators for super


(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3)     for each p_i in molecule m to randomly change
(4)         change p_i randomly
(5)     end for
(6)     generate a new molecule m′
(7)     MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m): generating the initial population for the super molecule selection phase.

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). The white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m′ from a given reaction molecule m for optimization. OnWallSMS works as follows: (1) the operator randomly chooses a tuple (v_i, f_i, p_i) in m; (2) the operator changes p_i randomly. In the end, the operator

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

generates a new molecule m′ from m as an intensification search.

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m′_1 and m′_2 from a given reaction molecule m. DecompSMS works as follows: (1) the operator generates two molecules m′_1 = m and m′_2 = m; (2) the operator keeps the tuples in m′_1 which are at the odd positions in m and then changes the remaining p_x's of the tuples in m′_1 randomly; (3) the operator retains the tuples in m′_2 which are at the even positions in m and then changes the remaining p_x's of the tuples in m′_2 randomly. In the end, the operator generates two new molecules m′_1 and m′_2 from m as a diversification search.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)     for each tempS in tempSet except SMole
(5)         choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0 randomly
(6)         generate a random number rnd ∈ (0, 1)
(7)         if rnd ≥ 0.5
(8)             find the first predecessor v_j = Pred(v_i) from v_i to the beginning in molecule tempS
(9)             interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10)            update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11)        end if
(12)        for each p_i in molecule tempS to randomly change
(13)            change p_i randomly
(14)        end for
(15)        if Fit(tempS) < Fit(SMole)
(16)            SMole = tempS
(17)        end if
(18)    end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)    pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m′_1 and m′_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallSMS to generate m′_1 from m_1, and then the operator generates the other new molecule m′_2 from m_2 in a similar fashion. In the end, the operator generates two new molecules m′_1 and m′_2 from m_1 and m_2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m′ from given molecules m_1 and m_2. SynthSMS works as follows: the operator keeps the tuples in m′ which are at the same positions in m_1 and m_2 with the same p_x's, and then changes the remaining p_y's in m′ randomly. As a result, the operator generates m′ from m_1 and m_2 as a diversification search.

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4 respectively present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population containing the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of elements in pop_subset, which is set as PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of optimizing both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3, in each iteration. d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which demonstrate their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing load among various processors usually helps minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing communication overhead and idle waiting time of processors always results in a more effective schedule, especially given relatively high unit communication costs. However, there is a contradiction between these two intuitions, because reducing communication overhead and idle waiting time of processors always means that some processors hold most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4)     d = 1
(5)     while d < d_max do
(6)         Randomly generate a molecule m_1 from the d-th neighborhood of m
(7)         Apply some local search method with m_1 as the initial molecule (the local optimum is presented by m_2)
(8)         if m_2 is better than m
(9)             m = m_2
(10)            d = 1
(11)        else
(12)            d = d + 1
(13)        end if
(14)    end while
(15) until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.

(1) for each processor p_i in the solution ω do
(2)     Compute Tend(p_i)
(3) end for
(4) Choose the processor p_max with the largest Tend(p_max)
(5) Randomly choose a task v_random from exec(p_max)
(6) Randomly choose a processor p_random different from p_max
(7) Reallocate v_random to the processor p_random
(8) Encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) express the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of reducing or increasing the tasks on a processor). The greater TCload(p_i) is, the stronger the tendency of reducing tasks on p_i is; and the greater TCcomm(p_i) is, the stronger the tendency of increasing tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure the tendency with the combination of TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model is under the comprehensive consideration of both intuitions and can make the VNS algorithm more effective than the original one.

Tend(p_i) = 1 / (1 + e^(−TCload(p_i)/TCcomm(p_i))). (11)

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum number of iterations without improvement to 3. The VNS algorithm stops if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary the p_i of each tuple but also interchange the positions of the tuples in a molecule, as intensification searches or diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper; it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)     Compute Tend(p_i)
(3) end for
(4) Choose the processor p_min with the smallest Tend(p_min)
(5) Set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)     Compute the set predecessor(v_i)
(8)     Update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)     cand = cand + predecessor(v_i)
(10) end for
(11) Randomly choose a task v_random from cand
(12) Reallocate v_random to the processor p_min
(13) Encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).

(1) choose randomly a tuple (v_i, f_i, p_i) in m_1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m_1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m′_1 = GenLBNeighborhood(m_1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m_2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m_2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m′_2 = GenLBNeighborhood(m_2)

Algorithm 7: IntermoleGO(m_1, m_2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks   p_1   p_2   p_3
v_1      7     8     9
v_2     12    14    16
v_3     14    15    16
v_4     13     3    14

before, to promote the intensification capability of DRSCRO and avoid function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m′_1 and m′_2 from given molecules m_1 and m_2. This operator first uses the steps in OnWallGO to generate m′_1 from m_1, and then the operator generates the other new molecule m′_2 from m_2 in a similar fashion. In the end, the operator generates two new molecules m′_1 and m′_2 from m_1 and m_2 as an intensification search. The detailed executions are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v1, v2, v4, v3) is found based on the upward rank value of each task in the DAG, and the first molecule m = ((v1, 0, p1), (v2, 1, p1), (v4, 0, p1), (v3, 1, p1)) can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations of the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v1, 0, p3), (v2, 1, p1), (v4, 0, p2), (v3, 1, p1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations of the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v1, 0, p2), (v4, 1, p2), (v2, 0, p2), (v3, 1, p2)).
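The upward-rank ordering that yields the path above can be sketched as follows. The execution costs are the Table 3 averages; the edge set and communication costs are hypothetical stand-ins for Figure 1(a), which is not reproduced here, chosen only so that the resulting order matches the path (v1, v2, v4, v3):

```python
from functools import lru_cache

# Mean execution cost of each task over p1..p3 (from Table 3).
w = {1: (7 + 8 + 9) / 3, 2: (12 + 14 + 16) / 3,
     3: (14 + 15 + 16) / 3, 4: (13 + 3 + 14) / 3}
# Hypothetical DAG edges with communication costs (stand-in for Fig. 1(a)).
comm = {(1, 2): 6, (1, 4): 12, (2, 3): 8, (4, 3): 4}
succ = {1: [2, 4], 2: [3], 4: [3], 3: []}

@lru_cache(maxsize=None)
def rank_u(v):
    # ranku(v) = w(v) + max over successors s of (comm(v, s) + ranku(s))
    if not succ[v]:
        return w[v]
    return w[v] + max(comm[(v, s)] + rank_u(s) for s in succ[v])

order = sorted(w, key=rank_u, reverse=True)
print(order)   # -> [1, 2, 4, 3], i.e. the path (v1, v2, v4, v3)
```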

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability for solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.

Mathematical Problems in Engineering

Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is obtained by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to obtain a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT, applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better intensification capability than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also acts as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is utilized in the ineffective reaction operator as well. Moreover, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures to promote the efficiency of VNS, different from the VNS proposed in [22]. All three advantages mentioned above enhance the ability to obtain a better convergence rate and better search results in the whole solution space, as demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiment and comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], by theory analysis and experimental results, TMSCRO and DMSCRO proved to have better performance than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and for DRSCRO, as a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and the comparison between DRSCRO and other similar kinds of algorithms.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including makespan tests and convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; it systematically applies row operations to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest number of tasks at the same level is 6. The reason for using these two application graphs as a test bed is not only to enhance the comparability of the various algorithms but also to show the application of the proposed algorithm as an illustrative demonstration without loss of generality. The second extensive test bed for the comparative study is a set of randomly generated DAGs. A random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiment. It allows the user to generate a variety of random graphs with different characteristics, such as CCR, the amount of computation of a task, the number of successors of a task, and the total number of tasks in a random graph. It is also assumed that all tasks have the same computation cost and all communication links have the same communication cost.
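The generator of [39] is not reproduced in the paper; a minimal sketch in the same spirit, with the successor cap and CCR as the knobs named above (the function and its parameters are illustrative, not the reference implementation), might look like:

```python
import random

def random_dag(n_tasks, max_succ, comp_cost, ccr, seed=0):
    """Generate a random DAG as (costs, comm, edges).

    Every task gets the same computation cost `comp_cost`, and every
    edge the same communication cost comp_cost * ccr, mirroring the
    uniform-cost assumption stated above.  `max_succ` caps how many
    later tasks each task may point to; edges only go from a lower
    index to a higher one, so the graph is acyclic by construction."""
    rng = random.Random(seed)
    edges = set()
    for v in range(n_tasks - 1):
        later = list(range(v + 1, n_tasks))
        for s in rng.sample(later, min(max_succ, len(later))):
            edges.add((v, s))
    costs = {v: comp_cost for v in range(n_tasks)}
    comm = {e: comp_cost * ccr for e in edges}
    return costs, comm, sorted(edges)

costs, comm, edges = random_dag(n_tasks=10, max_succ=3, comp_cost=10, ccr=0.2)
```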

As shown in Figure 2, the next-phase criterion and stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.


Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speed of a computing processor differs between tasks. Under this setting, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set to the value that makes the heterogeneity level 2 (i.e., hl = 0.333), unless otherwise specified in this paper.
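A minimal sketch of drawing per-processor execution costs at a given heterogeneity level; the uniform sampling scheme is an assumption of this illustration (the paper only fixes the best/worst speed ratio (1 + hl)/(1 − hl)):

```python
import random

def heterogeneous_costs(base_cost, n_procs, hl, rng):
    """Sample one execution cost per processor for a single task.

    Costs are drawn uniformly from [base*(1-hl), base*(1+hl)], so the
    worst/best cost ratio is bounded by (1+hl)/(1-hl), the heterogeneity
    level.  With hl = 1/3 that bound is 2."""
    return [base_cost * rng.uniform(1 - hl, 1 + hl) for _ in range(n_procs)]

rng = random.Random(42)
costs = heterogeneous_costs(base_cost=10.0, n_procs=4, hl=1 / 3, rng=rng)
assert max(costs) / min(costs) <= (1 + 1 / 3) / (1 - 1 / 3)  # ratio <= 2
```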

The details of the parameter setting are shown in Table 4. Parameters 6-12, which belong to the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiment.

Parameter                                      Value
CCR                                            0.1, 0.2, 1, 2, 5
Number of processors                           4, 8, 16, 32
hl                                             0.333
Successor number of a task in a random graph   1, 2, 3, 4
Total number of tasks in a random graph        10, 20, 50
InitialKE                                      1000
θ                                              500
ϑ                                              10
Buffer                                         200
KELossRate                                     0.2
MoleColl                                       0.2
PopSize                                        10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10-13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5-8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance in searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; the similar performance therefore indicates that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification capability, obtained by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and the utilization of one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the average results obtained by DRSCRO are better than those obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of its heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the makespan on average increases with the CCR value. This is because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance across a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Processors  HEFT (avg)  DMSCRO (avg)  TMSCRO (avg)  DRSCRO (avg)  DRSCRO (best)  DRSCRO (worst)  DRSCRO (variance)
4           120.086     117.624       116.993       116.759       114.365        119.589         2.487
8           120.074     116.633       116.007       115.775       113.402        118.582         2.466
16          120.031     116.101       115.478       115.247       112.885        118.041         2.455
32          119.993     115.476       114.856       114.529       112.182        117.306         2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Processors  HEFT (avg)  DMSCRO (avg)  TMSCRO (avg)  DRSCRO (avg)  DRSCRO (best)  DRSCRO (worst)  DRSCRO (variance)
4           126.852     124.252       123.585       123.337       122.042        126.327         1.769
8            96.952      94.965        94.455        94.266        92.333         96.551         2.008
16           82.356      80.668        80.235        80.074        78.433         82.015         1.706
32           75.630      74.080        73.682        73.535        72.027         75.317         1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

CCR   HEFT (avg)  DMSCRO (avg)  TMSCRO (avg)  DRSCRO (avg)  DRSCRO (best)  DRSCRO (worst)  DRSCRO (variance)
0.1   111.782     109.491       108.903       108.685       106.457        111.320         2.315
0.2   112.034     109.737       109.148       108.930       106.697        111.571         2.320
1     120.031     116.101       115.478       115.247       112.885        118.041         2.455
2     160.760     157.465       156.619       156.306       153.102        158.532         3.034
5     414.581     406.082       403.902       403.095       400.878        404.805         2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

CCR   HEFT (avg)  DMSCRO (avg)  TMSCRO (avg)  DRSCRO (avg)  DRSCRO (best)  DRSCRO (worst)  DRSCRO (variance)
0.1    96.612      94.632        94.124        93.935        92.010         96.212         2.001
0.2    96.952      94.965        94.455        94.266        92.333         96.551         2.008
1     131.094     128.407       127.717       127.462       124.849        130.552         2.715
2     222.635     218.071       216.900       216.467       214.194        219.550         2.456
5     403.600     395.327       393.204       392.418       390.260        394.083         2.069

scenario range, and that metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14-16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of the experimental results are listed in Tables 9-11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases. It can be seen that the average makespan rises rapidly with increasing CCR. This is because the DAG becomes more communication-intensive as CCR increases, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Processors  HEFT (avg)  DMSCRO (avg)  TMSCRO (avg)  DRSCRO (avg)  DRSCRO (best)  DRSCRO (worst)  DRSCRO (variance)
4           149.400     146.337       145.552       145.261       142.283        148.782         3.094
8           119.511     117.061       116.433       116.200       113.818        119.017         2.475
16          119.473     117.024       116.396       116.163       113.782        118.979         2.474
32          119.468     115.550       114.929       114.700       112.348        117.480         2.443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Tasks  HEFT (avg)  DMSCRO (avg)  TMSCRO (avg)  DRSCRO (avg)  DRSCRO (best)  DRSCRO (worst)  DRSCRO (variance)
10      714.484     706.149       699.100       690.708       688.982        693.638         2.025
20     1100.918    1089.685      1080.428      1069.082      1066.410       1071.479         2.619
50     1666.373    1662.543      1649.988      1634.230      1633.415       1636.261         1.165

Table 11: Experiment results of DRSCRO for random graphs under different CCRs; the task number is 50.

CCR   |P| = 4   |P| = 8   |P| = 16  |P| = 32
0.1   145.093   116.068   116.058   114.592
0.2   145.261   116.200   116.163   114.700
1     153.248   120.228   119.511   118.534
2     194.401   166.014   164.246   162.292
5     508.759   498.427   492.049   487.830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of the three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.

Figure 15: Average makespan for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19-21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17-21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for random graphs under different CCRs; the task number is 50.

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16.

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO still gives it a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).
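The percentage comparison above amounts to a relative reduction in time-to-convergence; the sample times below are made up purely for illustration:

```python
def convergence_speedup(t_ref, t_new):
    """Relative speedup of t_new over t_ref, as a percentage.

    t_ref, t_new: the times (e.g. in ms) each algorithm needs before
    its makespan stops improving."""
    return 100.0 * (t_ref - t_new) / t_ref

# hypothetical times-to-convergence in ms
print(convergence_speedup(t_ref=50000, t_new=40300))  # about 19.4
```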

Moreover, a statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method         Algorithms                p value    Hypothesis
Friedman test  DRSCRO, DMSCRO           2.53E-02    Rejected
Friedman test  DRSCRO, TMSCRO           2.53E-02    Rejected
Friedman test  DRSCRO, DMSCRO, TMSCRO   6.70E-03    Rejected

Table 13: Results of Quade tests, α = 0.05.

Method      Algorithms                p value    Hypothesis
Quade test  DRSCRO, DMSCRO           1.32E-02    Rejected
Quade test  DRSCRO, TMSCRO           1.32E-02    Rejected
Quade test  DRSCRO, DMSCRO, TMSCRO   1.09E-03    Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively, which both reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is not only compared against all the algorithms together but also compared against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.
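The Friedman statistic behind these tests can be sketched in a few lines; the ranking direction (lower makespan = better = rank 1) and tie-free data are assumptions of this illustration, and the statistic would still be compared against a chi-square threshold to obtain a p value:

```python
def friedman_statistic(results):
    """Friedman chi-square for k algorithms over n problem instances.

    `results` is a list of n rows, each holding k makespans (lower is
    better).  Assumes no ties within a row; ranks start at 1."""
    n, k = len(results), len(results[0])
    # rank each row: the best (smallest) makespan gets rank 1
    rank_sums = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    # chi^2_F = 12 / (n k (k+1)) * sum_j R_j^2  -  3 n (k+1)
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)

# e.g. four instances where algorithm 1 < algorithm 2 < algorithm 3 every time
data = [(100, 105, 110), (90, 95, 99), (80, 88, 91), (70, 71, 75)]
print(friedman_statistic(data))  # 8.0, the maximum for n = 4, k = 3
```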

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of its kind, because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions when averaged over all possible fitness functions. However, as the convergence tests show, the proposed DRSCRO can achieve good solutions faster than the other similar metaheuristic algorithms. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, considering the joint optimization of scheduling order and processor assignment, takes advantage of the VNS algorithm in the global optimization phase to improve its optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for better intensification, and the new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher convergence speed than the other CRO-based algorithms known to us, and that it can also obtain better average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will be extended to target two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406-471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506-521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing, vol. 3149 of Lecture Notes in Computer Science, pp. 230-237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141-147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57-69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330-343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138-153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175-187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23-32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390-1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142-149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395-406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462-487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825-837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113-120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13-22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing, vol. 40 of Communications in Computer and Information Science, pp. 1-7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824-834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567-581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272-277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45-72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455-463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381-399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967-980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306-1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1-5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379-396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458-463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097-1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867-2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67-82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175-193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319-360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1-8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683-691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044-2064, 2010.


(1) MoleN = 1
(2) while MoleN ≤ PopSize do
(3)     for each p_i in molecule m to randomly change
(4)         change p_i randomly
(5)     end for
(6)     generate a new molecule m′
(7)     MoleN = MoleN + 1
(8) end while

Algorithm 2: InitMoleSMS(m): generating the initial population for the super molecule selection phase.
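Algorithm 2 amounts to cloning a seed molecule PopSize times and randomizing the processor field of every tuple. A minimal Python sketch of this idea; the tuple encoding (task, flag, processor) follows the paper, but the function and variable names are illustrative assumptions, not the authors' implementation:

```python
import random

def init_mole_sms(seed, processors, pop_size):
    """Generate the initial population for the super molecule selection
    phase: clone the seed molecule pop_size times and randomize the
    processor field of each (task, flag, processor) tuple."""
    population = []
    for _ in range(pop_size):
        molecule = [(v, f, random.choice(processors)) for (v, f, _) in seed]
        population.append(molecule)
    return population

# the first molecule of the illustrative example in Section 4.5
seed = [(1, 0, "p1"), (2, 1, "p1"), (4, 0, "p1"), (3, 1, "p1")]
pop = init_mole_sms(seed, ["p1", "p2", "p3"], pop_size=10)
```

Note that only processor assignments are perturbed here; the task order and execution flags of the seed molecule are preserved in every generated molecule.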

Figure 3: Example of OnWallSMS.

Figure 4: Example of DecompSMS.

molecule selection, in which the molecules correspond to the DAG shown in Figure 1(a). The white blocks in these examples denote the tuples that do not change during the reaction operation calculations.

As shown in Figure 3, the operator OnWallSMS is used to generate a new molecule m′ from a given reaction molecule m for optimization. OnWallSMS works as follows: (1) the operator randomly chooses a tuple (v_i, f_i, p_i) in m; (2) the operator changes p_i randomly. In the end, the operator generates a new molecule m′ from m as an intensification search.

Figure 5: Example of IntermoleSMS.

Figure 6: Example of SynthSMS.

As shown in Figure 4, the operator DecompSMS is used to generate new molecules m′1 and m′2 from a given reaction molecule m. DecompSMS works as follows: (1) the operator generates two molecules, m′1 = m and m′2 = m; (2) the operator keeps the tuples in m′1 which are at the odd positions in m and then changes the remaining p_x's of the tuples in m′1 randomly; (3) the operator retains the tuples in m′2 which are at the even positions in m and then changes the remaining p_x's of the tuples in m′2 randomly. In the end, the operator generates two new molecules, m′1 and m′2, from m as a diversification search.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)     for each tempS in tempSet except SMole
(5)         choose a tuple (v_i, f_i, p_i) in tempS where f_i = 0 randomly
(6)         generate a random number rnd ∈ (0, 1)
(7)         if rnd ≥ 0.5
(8)             find the first predecessor v_j = Pred(v_i) from v_i to the beginning in molecule tempS
(9)             interchange the positions of (v_i, f_i, p_i) and (v_{j+1}, f_{j+1}, p_{j+1}) in molecule tempS
(10)            update f_i, f_{i+1}, and f_{j+1} as defined in the last paragraph of Section 4.2.1
(11)        end if
(12)        for each p_i in molecule tempS to randomly change
(13)            change p_i randomly
(14)        end for
(15)        if Fit(tempS) < Fit(SMole)
(16)            SMole = tempS
(17)        end if
(18)    end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)    pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m′1 and m′2 from given molecules m1 and m2. This operator first uses the steps in OnWallSMS to generate m′1 from m1, and then the operator generates the other new molecule m′2 from m2 in a similar fashion. In the end, the operator generates two new molecules, m′1 and m′2, from m1 and m2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m′ from given molecules m1 and m2. SynthSMS works as follows: the operator keeps the tuples in m′ which are at the same positions in m1 and m2 with the same p_x's and then changes the remaining p_y's in m′ randomly. As a result, the operator generates m′ from m1 and m2 as a diversification search.
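The four elementary operators of the super molecule selection phase only mutate processor fields while keeping the task order fixed. A hedged sketch of three of them under the same illustrative tuple encoding (the names and list-based representation are our assumptions; SynthSMS here presumes both parents share one tuple order):

```python
import random

def on_wall_sms(m, processors):
    """Intensification: randomly re-assign the processor of one tuple."""
    m2 = list(m)
    i = random.randrange(len(m2))
    v, f, _ = m2[i]
    m2[i] = (v, f, random.choice(processors))
    return m2

def decomp_sms(m, processors):
    """Diversification: child 1 keeps the odd (1-based) positions and
    child 2 the even positions; the remaining processors are randomized."""
    child1 = [t if i % 2 == 0 else (t[0], t[1], random.choice(processors))
              for i, t in enumerate(m)]
    child2 = [t if i % 2 == 1 else (t[0], t[1], random.choice(processors))
              for i, t in enumerate(m)]
    return child1, child2

def synth_sms(m1, m2, processors):
    """Diversification: keep positions where both parents agree on the
    processor; randomize the rest."""
    return [(v, f, p) if p == q else (v, f, random.choice(processors))
            for (v, f, p), (_, _, q) in zip(m1, m2)]
```

IntermoleSMS is then simply `on_wall_sms` applied independently to each of two parent molecules.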

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4 respectively present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population containing the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of the population) is the output of the super molecule selection phase, the tuple orders and p_i's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num, the number of elements in pop_subset, is set as PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in the VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of optimizing both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3. In each iteration, d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which have demonstrated their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing the load among the various processors usually helps to minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing the communication overhead and idle waiting time of processors always results in a more effective schedule, especially given a relatively high unit communication cost. However, there is a contradiction between these two intuitions, because reducing the communication overhead and idle waiting time of processors always means that some processors are assigned most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4)     d = 1
(5)     while d < d_max do
(6)         Randomly generate a molecule m1 from the dth neighborhood of m
(7)         Apply some local search method with m1 as the initial molecule (the local optimum is presented by m2)
(8)         if m2 is better than m
(9)             m = m2
(10)            d = 1
(11)        else
(12)            d = d + 1
(13)        end if
(14)    end while
(15)    until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.
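The loop of Algorithm 4 follows the basic VNS scheme of [32]: shake in the d-th neighborhood, run a local search, and restart from the first neighborhood whenever the solution improves. A generic sketch with placeholder shake, local-search, and fitness callables (these callables and the iteration cap are our assumptions, not the paper's exact routines):

```python
def basic_vns(m, neighborhoods, local_search, fitness, d_max=2, max_rounds=20):
    """Basic VNS: 'neighborhoods[d](m)' draws a random molecule from the
    (d+1)-th neighborhood of m; an improvement recenters the search and
    resets d to the first neighborhood."""
    for _ in range(max_rounds):            # iteration cap (termination criterion)
        d = 0
        while d < d_max:
            m1 = neighborhoods[d](m)       # shaking
            m2 = local_search(m1)          # local optimum around m1
            if fitness(m2) < fitness(m):   # better makespan found
                m, d = m2, 0
            else:
                d += 1                     # try the next neighborhood
    return m
```

With d_max = 2, the two neighborhoods would be the load balance and communication reduction structures described below.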

(1) for each processor p_i in the solution ω do
(2)     Compute Tend(p_i)
(3) end for
(4) Choose the processor p_max with the largest Tend(p_max)
(5) Randomly choose a task v_random from exec(p_max)
(6) Randomly choose a processor p_random different from p_max
(7) Reallocate v_random to the processor p_random
(8) Encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).

So, different from the original structures in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) indicate the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of reducing or increasing tasks). The greater TCload(p_i) is, the stronger the tendency of reducing tasks on p_i is; and the greater TCcomm(p_i) is, the stronger the tendency of increasing tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure this tendency by combining TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model takes both intuitions into comprehensive consideration and can make the VNS algorithm more effective than the original one.

Tend(p_i) = 1 / (1 + e^(−TCload(p_i)/TCcomm(p_i))).   (11)
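Equation (11) is a logistic function of the ratio TCload/TCcomm, so Tend(p_i) approaches 1 for load-dominated processors and 0 for communication-dominated ones. A direct sketch (the function and dictionary layout are assumed for illustration):

```python
import math

def tend(tc_load, tc_comm):
    """Processor-selection tendency of (11): a sigmoid of the
    load-to-communication cost ratio of a processor."""
    return 1.0 / (1.0 + math.exp(-tc_load / tc_comm))

def pick_processors(costs):
    """costs maps processor -> (TCload, TCcomm). The load balance move
    sheds a task from the argmax of Tend (Algorithm 5); the communication
    reduction move pulls a task onto the argmin of Tend (Algorithm 6)."""
    t = {p: tend(*c) for p, c in costs.items()}
    return max(t, key=t.get), min(t, key=t.get)
```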

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum number of iterations without improvement to 3. The VNS algorithm stops if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.
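The combination strategy is elitist truncation; a one-function sketch (names assumed, fitness = makespan):

```python
def combine(current_pop, vns_output, fitness, pop_size):
    """Merge the current population with the VNS output, sort by
    increasing makespan, and keep the best pop_size molecules
    as the new initial population."""
    return sorted(current_pop + vns_output, key=fitness)[:pop_size]
```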

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary the p_i of each tuple but also interchange the positions of the tuples in a molecule, as intensification searches or diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper; it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)     Compute Tend(p_i)
(3) end for
(4) Choose the processor p_min with the smallest Tend(p_min)
(5) Set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)     Compute the set predecessor(v_i)
(8)     Update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)     cand = cand + predecessor(v_i)
(10) end for
(11) Randomly choose a task v_random from cand
(12) Reallocate v_random to the processor p_min
(13) Encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).
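The candidate set built in steps (5)–(10) of Algorithm 6 contains the off-processor predecessors of the tasks currently on p_min; moving one of them onto p_min removes at least one inter-processor communication. A sketch of just this candidate construction (the data structures are illustrative assumptions):

```python
def comm_reduction_candidates(exec_pmin, predecessors):
    """Collect tasks that are predecessors of tasks on p_min but currently
    run on other processors; reallocating one of them to p_min removes
    at least one inter-processor communication edge."""
    on_pmin = set(exec_pmin)
    cand = set()
    for v in exec_pmin:
        cand |= set(predecessors.get(v, ())) - on_pmin
    return cand
```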

(1) choose randomly a tuple (v_i, f_i, p_i) in m1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m′1 = GenLBNeighborhood(m1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m′2 = GenLBNeighborhood(m2)

Algorithm 7: IntermoleGO(m1, m2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks | p1 | p2 | p3
v1 | 7 | 8 | 9
v2 | 12 | 14 | 16
v3 | 14 | 15 | 16
v4 | 13 | 3 | 14

before, to promote the intensification capability of DRSCRO and to avoid the function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m′1 and m′2 from given molecules m1 and m2. This operator first uses the steps in OnWallGO to generate m′1 from m1, and then the operator generates the other new molecule m′2 from m2 in a similar fashion. In the end, the operator generates two new molecules, m′1 and m′2, from m1 and m2 as an intensification search. The detailed execution is presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v1, v2, v4, v3) is found based on the upward rank value of each task in the DAG, and the first molecule m = ((v1, 0, p1), (v2, 1, p1), (v4, 0, p1), (v3, 1, p1)) can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations of the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v1, 0, p3), (v2, 1, p1), (v4, 0, p2), (v3, 1, p1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations of the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v1, 0, p2), (v4, 1, p2), (v2, 0, p2), (v3, 1, p2)).
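The fitness of a molecule can be evaluated by replaying the tuple order on the assigned processors: each task starts once its processor is free and all predecessor data have arrived, with communication cost charged only across processors. A simplified evaluator; the precedence edges and communication costs below are illustrative stand-ins, not the actual values of Figure 1(a), while the execution costs are those of Table 3:

```python
def makespan(molecule, exec_cost, comm_cost, preds):
    """Replay a molecule ((task, flag, processor), ...) in tuple order.
    exec_cost[task][proc]: execution cost of task on proc;
    comm_cost[(u, v)]: transfer cost, charged only when u and v run
    on different processors."""
    finish, proc_of, proc_free = {}, {}, {}
    for task, _flag, proc in molecule:
        ready = 0.0
        for u in preds.get(task, ()):
            arrival = finish[u] + (comm_cost.get((u, task), 0.0)
                                   if proc_of[u] != proc else 0.0)
            ready = max(ready, arrival)
        start = max(ready, proc_free.get(proc, 0.0))
        finish[task] = start + exec_cost[task][proc]
        proc_of[task] = proc
        proc_free[proc] = finish[task]
    return max(finish.values())

exec_cost = {1: {"p1": 7, "p2": 8, "p3": 9},    # Table 3
             2: {"p1": 12, "p2": 14, "p3": 16},
             3: {"p1": 14, "p2": 15, "p3": 16},
             4: {"p1": 13, "p2": 3, "p3": 14}}
preds = {2: [1], 3: [1], 4: [1]}                # assumed precedence edges
comm = {(1, 2): 5, (1, 3): 5, (1, 4): 5}        # assumed communication costs
m = [(1, 0, "p1"), (2, 1, "p1"), (4, 0, "p1"), (3, 1, "p1")]
# all tasks on one processor: no communication, makespan = 7 + 12 + 13 + 14
```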

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability of solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown to some extent that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is obtained by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better capability of intensification search than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also serves as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is also utilized in the ineffective reaction operator. Third, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures to promote the efficiency of VNS, different from the VNS proposed in [22]. All three advantages mentioned above enhance the rapidity of convergence and the quality of the search result in the whole solution space, as demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], theoretical analysis and experimental results have proved that TMSCRO and DMSCRO perform better than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and for DRSCRO, as a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and on the comparison between DRSCRO and other algorithms of a similar kind.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: the molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; row operations are applied systematically to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest task number at the same level is 6. These two application graphs are used as a test bed not only to enhance the comparability of the various algorithms but also to show the practical application of our proposed algorithm as an illustrative demonstration, without loss of generality. The second extensive test bed for the comparative study consists of DAGs of random graphs. A random graph generator, as presented in [39], is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as CCR, the amount of computation of a task, the successor number of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and communication links have the same computation cost and communication cost, respectively.

As shown in Figure 2, the next-phase criterion and the stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop. The stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.


Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speeds of a computing processor are different for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set to the value that makes the heterogeneity level 2, unless otherwise specified in this paper.
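One common way to realize such a heterogeneity model is to draw each task's cost on each processor uniformly from [w(1 − hl), w(1 + hl)] around a mean cost w, so the best-to-worst speed ratio is bounded by (1 + hl)/(1 − hl); with hl = 1/3 this bound is 2. A sketch of this idea (the uniform sampling is our assumption about the cost generator, not a detail given in the paper):

```python
import random

def heterogeneous_costs(mean_cost, num_procs, hl=1/3):
    """Sample per-processor execution costs of one task so that the
    largest possible best/worst speed ratio is (1 + hl) / (1 - hl)."""
    lo, hi = mean_cost * (1 - hl), mean_cost * (1 + hl)
    return [random.uniform(lo, hi) for _ in range(num_procs)]

costs = heterogeneous_costs(10.0, num_procs=4)
```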

The details of the parameter setting are shown in Table 4. Parameters 6–12, which concern the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and with a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiment.

Parameter | Value
CCR | 0.1, 0.2, 1, 2, 5
Number of processors | 4, 8, 16, 32
hl | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph | 10, 20, 50
InitialKE | 1000
θ | 500
ϑ | 10
Buffer | 200
KELossRate | 0.2
MoleColl | 0.2
PopSize | 10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5–8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance on searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; this therefore suggests that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better capability of intensification search, by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and the utilization of one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the performance of the average results obtained by DRSCRO is better than that obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of the heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the makespan on average increases with the CCR value. This is because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance over a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8 | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
16 | 82.356 | 80.668 | 80.235 | 80.074 | 78.433 | 82.015 | 1.706
32 | 75.630 | 74.080 | 73.682 | 73.535 | 72.027 | 75.317 | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2 | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5 | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 96.612 | 94.632 | 94.124 | 93.935 | 92.010 | 96.212 | 2.001
0.2 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
1 | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2 | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5 | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range and that a metaheuristic algorithm performs more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments. The details of these experimental results are listed in Tables 9–11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the makespan on average as the CCR value increases. It can be seen that the average makespan rises rapidly with the increase of the value of CCR. This is because the DAG becomes more communication-intensive as CCR increases, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4 | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8 | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Number of tasks | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
10 | 714.484 | 706.149 | 699.100 | 690.708 | 688.982 | 693.638 | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for the random graph under different CCRs (the task number is 50).

Value of CCR   Processor number is 4   Processor number is 8   Processor number is 16   Processor number is 32
0.1            1450.93                 1160.68                 1160.58                  1145.92
0.2            1452.61                 1162.00                 1161.63                  1147.00
1              1532.48                 1202.28                 1195.11                  1185.34
2              1944.01                 1660.14                 1642.46                  1622.92
5              5087.59                 4984.27                 4920.49                  4878.30

Figure 14: Average makespan for random graphs under different processor numbers (task number is 50 and CCR = 0.2); makespan versus |P| = 4, 8, 16, and 32 for HEFT, DMSCRO, TMSCRO, and DRSCRO.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of these three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the beginning of the time counting of DRSCRO is set as the start of the global optimization phase processing. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.

Figure 15: Average makespan for random graphs under different task numbers (processor number is 32 and CCR = 1.0); makespan versus the number of tasks (10, 20, 50) for HEFT, DMSCRO, TMSCRO, and DRSCRO.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graph under different CCRs (the task number is 50); makespan versus CCR (0.1, 0.2, 1, 2, 5) for P = 4, 8, 16, and 32.

Figure 17: Convergence trace for the molecular dynamics code (CCR = 1 and the number of processors is 16); makespan (×10^4) versus time (ms) for DMSCRO, TMSCRO, and DRSCRO.

Figure 18: Convergence trace for Gaussian elimination (CCR = 0.2 and the number of processors is 8); makespan (×10^4) versus time (ms) for DMSCRO, TMSCRO, and DRSCRO.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also lets it obtain a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks; makespan (×10^4) versus time (ms) for DMSCRO, TMSCRO, and DRSCRO.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks; makespan (×10^4) versus time (ms) for DMSCRO, TMSCRO, and DRSCRO.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks; makespan (×10^5) versus time (ms) for DMSCRO, TMSCRO, and DRSCRO.

and DMSCRO. The simulation experimental results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, the statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests (α = 0.05).

Method          Algorithms                       p value    Hypothesis
Friedman test   DRSCRO vs. DMSCRO                2.53E−02   Rejected
Friedman test   DRSCRO vs. TMSCRO                2.53E−02   Rejected
Friedman test   DRSCRO vs. DMSCRO vs. TMSCRO     6.70E−03   Rejected

Table 13: Results of Quade tests (α = 0.05).

Method       Algorithms                       p value    Hypothesis
Quade test   DRSCRO vs. DMSCRO                1.32E−02   Rejected
Quade test   DRSCRO vs. TMSCRO                1.32E−02   Rejected
Quade test   DRSCRO vs. DMSCRO vs. TMSCRO     1.09E−03   Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are specifically considered, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
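To make the test concrete, the Friedman statistic can be computed from per-instance ranks. The sketch below uses made-up makespans (not the paper's data) for k = 3 algorithms on n = 5 instances; a statistic above the chi-square critical value (5.99 for 2 degrees of freedom at α = 0.05) rejects the null hypothesis of equivalent performance.

```python
def friedman_statistic(results):
    """results[i][j] = makespan of algorithm j on problem instance i.

    Ranks the algorithms per instance (1 = smallest makespan, averaging
    tied ranks) and returns the Friedman chi-square statistic.
    """
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1                      # extend the block of tied values
            avg = (i + j) / 2 + 1           # 1-based average rank of the block
            for t in range(i, j + 1):
                ranks[order[t]] = avg
            i = j + 1
        for j in range(k):
            rank_sums[j] += ranks[j]
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)

# hypothetical makespans of (DRSCRO, TMSCRO, DMSCRO) on 5 instances
data = [
    [1452.6, 1455.5, 1463.4],
    [1162.0, 1164.3, 1170.6],
    [1161.6, 1164.0, 1170.2],
    [1147.0, 1149.3, 1155.5],
    [4878.3, 4920.5, 4984.3],
]
chi2 = friedman_statistic(data)  # 10.0 here, since one algorithm always ranks 1
```

With these toy numbers the first algorithm ranks first on every instance, giving rank sums (5, 10, 15) and a statistic of 10.0 > 5.99, i.e., the null hypothesis would be rejected, mirroring the shape of the results in Table 12.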

Tables 12 and 13, respectively, list the results of the Friedman test and the Quade test, which both reject the null hypothesis of equivalent performance. In both tests, our proposed DRSCRO is not only compared against all the algorithms together but also compared against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind, because, when averaged over all possible fitness functions, each well-designed metaheuristic algorithm has the same performance in searching for optimal solutions, according to the No-Free-Lunch Theorem. However, the proposed DRSCRO can achieve better performance and find good solutions faster than the other metaheuristic algorithms of the same kind, as the experimental results of the convergence tests show. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO obtains a better super molecule by a metaheuristic method and, under the consideration of the optimization of scheduling order and processor assignment, takes advantage of the VNS algorithm in the global optimization phase to improve the optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability. The new processor selection model utilized in the neighborhood structures also promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed for DAG scheduling on heterogeneous systems in this paper. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. The super molecule selection phase is used to obtain a super molecule by a metaheuristic method for a better convergence rate, different from other CRO-based algorithms for DAG scheduling on heterogeneous systems. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO can achieve a higher speedup than the other CRO-based algorithms as far as we know, and the DRSCRO algorithm can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to aim at two main objectives: (1) minimization of schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "Genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.


(1) tempSet = pop_set
(2) pop_subset = ∅
(3) if pop_set is the input of the VNS algorithm for the first time (i.e., the output of the super molecule selection phase)
(4)     for each tempS in tempSet except SMole
(5)         randomly choose a tuple (vi, fi, pi) in tempS where fi = 0
(6)         generate a random number rnd ∈ (0, 1)
(7)         if rnd ≥ 0.5
(8)             find the first predecessor vj = Pred(vi) from vi to the beginning in molecule tempS
(9)             interchange the positions of (vi, fi, pi) and (vj+1, fj+1, pj+1) in molecule tempS
(10)            update fi, fi+1, and fj+1 as defined in the last paragraph of Section 4.2.1
(11)        end if
(12)        for each pi in molecule tempS to randomly change
(13)            change pi randomly
(14)        end for
(15)        if Fit(tempS) < Fit(SMole)
(16)            SMole = tempS
(17)        end if
(18)    end for
(19) end if
(20) pop_subset adds SMole
(21) pop_subset adds the molecules in tempSet with tuple order different from SMole
(22) while |pop_subset| ≠ pop_subset_num do
(23)    pop_subset adds a molecule in pop_set which does not exist in pop_subset
(24) end while

Algorithm 3: InitVNS(pop_set, pop_subset_num): initializing the subset of the population pop_set for undergoing the VNS algorithm.
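The selection part of Algorithm 3 (steps (20)–(24)) can be sketched as follows. This is a simplified illustration, not the paper's code: the tuple-order perturbation and the fi bookkeeping of steps (5)–(14) are omitted, and a molecule is modeled as a plain list of (task, flag, processor) tuples.

```python
def init_vns_subset(pop, smole, subset_num):
    """Keep the super molecule, prefer molecules whose tuple (task) order
    differs from it, then top up from the rest of the population."""
    order = lambda m: [t[0] for t in m]   # task sequence of a molecule
    subset = [smole]
    subset += [m for m in pop if order(m) != order(smole)]
    for m in pop:                         # steps (22)-(24): top up to subset_num
        if len(subset) >= subset_num:
            break
        if m not in subset:
            subset.append(m)
    return subset[:subset_num]

# toy molecules over tasks v1..v4 and processors p1..p3 (illustrative values)
smole = [("v1", 0, "p3"), ("v2", 1, "p1"), ("v4", 0, "p2"), ("v3", 1, "p1")]
other = [("v1", 0, "p1"), ("v4", 1, "p1"), ("v2", 0, "p1"), ("v3", 1, "p1")]
same  = [("v1", 0, "p2"), ("v2", 0, "p2"), ("v4", 0, "p2"), ("v3", 0, "p2")]
subset = init_vns_subset([same, other, smole], smole, 2)
```

Here `other` has a different task order than SMole, so it is preferred over `same`, and the subset is (SMole, other).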

As shown in Figure 5, the operator IntermoleSMS is used to generate new molecules m1' and m2' from given molecules m1 and m2. This operator first uses the steps in OnWallSMS to generate m1' from m1, and then the operator generates the other new molecule m2' from m2 in a similar fashion. In the end, the operator generates two new molecules m1' and m2' from m1 and m2 as an intensification search.

As shown in Figure 6, the operator SynthSMS is used to generate a new molecule m' from given molecules m1 and m2. SynthSMS works as follows: the operator keeps in m' the tuples which are at the same position in m1 and m2 with the same px's, and then changes the remaining py's in m' randomly. As a result, the operator generates m' from m1 and m2 as a diversification search.

4.4. Global Optimization Phase

4.4.1. Initialization. VNS is utilized by our proposed algorithm as the initialization of the global optimization phase, and it also serves as a local search process to promote the intensification capability of DRSCRO during the running of the whole algorithm.

Algorithms 3 and 4, respectively, present the subset generator of the phase output and the main steps of the whole VNS algorithm (i.e., the initialization of the global optimization phase). In DRSCRO, the VNS algorithm only processes the subset of the population with the super molecule SMole after each iteration in the global optimization phase (the output of the super molecule selection phase is the input of VNS for the first time). As presented in Algorithm 3, if pop_set (i.e., the set of population) is the output of the super molecule selection phase, the tuple orders and pi's of its elements will be adjusted. pop_subset is the subset of the population, and pop_subset_num is the number of elements in pop_subset, which is set to PopSize × 50% in this paper.

In Algorithm 4, different from the VNS proposed in [21], the task priority is changeable in the VNS algorithm used in DRSCRO; the reason is that an unchangeable task priority in the VNS reduces its efficiency in obtaining a better solution. Therefore, under the consideration of the optimization of both the scheduling order and the processor assignment, the input molecules of VNS can have different tuple orders (i.e., task priorities), as presented in Algorithm 3, in each iteration. d_max is set to 2, as presented in [37]. As the essential factor of VNS, two neighborhood structures, the load balance and communication reduction neighborhood structures, which have demonstrated their power in solving the DAG scheduling problem on heterogeneous systems as presented in [21], are adopted by the VNS algorithm in DRSCRO for their high efficiency. In this paper, a new model is also proposed for the processor selection of these two neighborhood structures. As presented in [21], two intuitions are used to construct the neighborhood structures. One is that balancing the load among the various processors usually helps to minimize the makespan, especially when most tasks are allocated to only a few processors; the other is that reducing the communication overhead and idle waiting time of processors always results in a more effective schedule, especially given a relatively high unit communication cost. However, there is a contradiction between these two intuitions, because reducing the communication overhead and idle waiting time of processors usually means that some processors hold most of the tasks.


(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d (d = 1, 2, 3, …, d_max)
(3) for each individual m in pop_subset do
(4)     d = 1
(5)     while d < d_max do
(6)         Randomly generate a molecule m1 from the dth neighborhood of m
(7)         Apply some local search method with m1 as the initial molecule (the local optimum is presented by m2)
(8)         if m2 is better than m
(9)             m = m2
(10)            d = 1
(11)        else
(12)            d = d + 1
(13)        end if
(14)    end while
(15) until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set): generating the initial population of the global optimization phase.
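The shake/descend/restart loop of Algorithm 4 can be sketched as a generic VNS skeleton. This is an illustrative sketch only: the paper's molecules and its load-balance/communication-reduction neighborhoods are replaced by a toy numeric problem, and termination criterion 2 (three non-improving iterations) is simplified to a single non-improving round.

```python
import random

def vns(m, neighborhoods, local_search, fitness, d_max=2, max_rounds=20):
    """Generic variable neighborhood search (cf. Algorithm 4).

    neighborhoods[d] maps a solution to a random neighbor in the
    (d+1)-th neighborhood structure; d_max of them are assumed.
    """
    for _ in range(max_rounds):              # termination criterion 1 (20 rounds)
        improved = False
        d = 0
        while d < d_max:
            m1 = neighborhoods[d](m)         # shake: random point in Neigh_d(m)
            m2 = local_search(m1)            # descend to a local optimum
            if fitness(m2) < fitness(m):     # smaller makespan is better
                m, d, improved = m2, 0, True # restart from the first neighborhood
            else:
                d += 1
        if not improved:                     # simplified stand-in for criterion 2
            break
    return m

# toy usage: minimize x^2 with two step-size neighborhoods
rng = random.Random(0)
neighborhoods = [
    lambda x: x + rng.choice([-1, 1]),
    lambda x: x + rng.choice([-5, 5]),
]
best = vns(40, neighborhoods, local_search=lambda x: x, fitness=lambda x: x * x)
```

Since the incumbent only changes on strict improvement, the returned solution is never worse than the starting one.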

(1) for each processor pi in the solution ω do
(2)     Compute Tend(pi)
(3) end for
(4) Choose the processor pmax with the largest Tend(pmax)
(5) Randomly choose a task vrandom from exec(pmax)
(6) Randomly choose a processor prandom different from pmax
(7) Reallocate vrandom to the processor prandom
(8) Encode and reschedule the changed solution ω'
(9) return ω'

Algorithm 5: GenNeighborhoodofLoadBalance(ω).
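A minimal sketch of the Algorithm 5 move, assuming the schedule is kept as a task-to-processor map and the Tend(pi) scores of (11) are precomputed; the names and cost values below are illustrative, not the paper's code (the final encode/reschedule step is omitted):

```python
import random

def load_balance_neighbor(assignment, tend_scores, rng):
    """Move one random task off the processor with the largest Tend score."""
    p_max = max(tend_scores, key=tend_scores.get)           # steps (1)-(4)
    movable = [v for v, p in assignment.items() if p == p_max]
    v = rng.choice(movable)                                 # step (5)
    others = [p for p in tend_scores if p != p_max]
    neighbor = dict(assignment)
    neighbor[v] = rng.choice(others)                        # steps (6)-(7)
    return neighbor

rng = random.Random(1)
sched = {"v1": "p1", "v2": "p1", "v3": "p2", "v4": "p1"}
scores = {"p1": 0.9, "p2": 0.4, "p3": 0.5}                  # hypothetical Tend values
new_sched = load_balance_neighbor(sched, scores, rng)
```

Exactly one task leaves the most loaded processor ("p1" here), which is the single-move neighborhood the VNS shakes within.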

So, different from the original ones in [21], we develop a new model for processor selection. Let TCload(pi) be the total task execution cost of processor pi, and let TCcomm(pi) be the communication cost overhead of processor pi, as defined in [21]. The values of TCload(pi) and TCcomm(pi) represent the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of task reducing or increasing): the greater TCload(pi) is, the stronger the tendency of reducing tasks on pi is, and the greater TCcomm(pi) is, the stronger the tendency of increasing tasks on pi is. Therefore, a parameter Tend(pi) is developed to measure the tendency with the combination of TCload(pi) and TCcomm(pi), as in (11). The neighborhood structure computation processes of load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model is under the comprehensive consideration of both intuitions and can make the VNS algorithm more effective than the original one:

Tend(pi) = 1 / (1 + e^(−TCload(pi)/TCcomm(pi))).   (11)
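Equation (11) is a logistic function of the load-to-communication ratio, so each Tend(pi) lies in (0, 1). A small sketch (with made-up cost values, not the paper's data) shows how the scores rank the processors for the two moves:

```python
import math

def tend(tc_load, tc_comm):
    # Eq. (11): Tend(p) = 1 / (1 + exp(-TCload(p) / TCcomm(p)))
    return 1.0 / (1.0 + math.exp(-tc_load / tc_comm))

# hypothetical per-processor execution and communication costs
tc_load = {"p1": 60.0, "p2": 20.0, "p3": 35.0}
tc_comm = {"p1": 10.0, "p2": 25.0, "p3": 15.0}
scores = {p: tend(tc_load[p], tc_comm[p]) for p in tc_load}

p_max = max(scores, key=scores.get)  # Algorithm 5 sheds a task from p_max
p_min = min(scores, key=scores.get)  # Algorithm 6 pulls a predecessor onto p_min
```

With these values, p1 (high load relative to its communication overhead) scores highest and would shed a task, while p2 scores lowest and would attract a predecessor.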

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum number of iterations without improvement to 3. The VNS algorithm stops if either criterion is satisfied. To form a new initial population, a combination strategy is utilized for combining the current population and the VNS output after the VNS algorithm outputs the subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected to generate the new initial population.
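The combination step is a merge-and-truncate; a sketch with makespan as the sort key (molecules and values below are illustrative):

```python
def combine(current_pop, vns_out, pop_size, makespan):
    """Merge both pools, sort by increasing makespan, keep the best pop_size."""
    return sorted(current_pop + vns_out, key=makespan)[:pop_size]

# toy molecules tagged with hypothetical makespans
cur = [("a", 50.0), ("b", 44.0), ("c", 61.0)]
out = [("a'", 42.0), ("b'", 47.0)]
new_pop = combine(cur, out, 3, makespan=lambda m: m[1])
```

Because the merged pool is truncated after sorting, the best molecules survive regardless of which pool they came from; here the VNS-improved a' displaces the worst of the current population.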

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary the pi of each tuple but also interchange the positions of the tuples in a molecule, as intensification searches or diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (as an intensification search), decomposition (as a diversification search), and synthesis (as a diversification search) are as presented in [30], and we do not repeat them here to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper; it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor pi in the solution ω do
(2)     Compute Tend(pi)
(3) end for
(4) Choose the processor pmin with the smallest Tend(pmin)
(5) Set the candidate set cand empty
(6) for each task vi in the set exec(pmin) do
(7)     Compute the set predecessor(vi)
(8)     Update predecessor(vi) with predecessor(vi) = predecessor(vi) − exec(pmin)
(9)     cand = cand + predecessor(vi)
(10) end for
(11) Randomly choose a task vrandom from cand
(12) Reallocate vrandom to the processor pmin
(13) Encode and reschedule the changed solution ω'
(14) return ω'

Algorithm 6: GenNeighborhoodofCommReduction(ω).
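The companion move of Algorithm 6 can be sketched in the same style as the load-balance one; again the names, scores, and DAG below are illustrative assumptions, and the encode/reschedule step is omitted:

```python
import random

def comm_reduction_neighbor(assignment, tend_scores, predecessors, rng):
    """Pull one off-processor predecessor onto the processor with the
    smallest Tend score, shortening its incoming communication."""
    p_min = min(tend_scores, key=tend_scores.get)              # steps (1)-(4)
    on_pmin = [v for v, p in assignment.items() if p == p_min]
    cand = []                                                  # steps (5)-(10)
    for v in on_pmin:
        cand += [u for u in predecessors.get(v, []) if assignment[u] != p_min]
    if not cand:
        return dict(assignment)        # no movable predecessor: unchanged copy
    neighbor = dict(assignment)
    neighbor[rng.choice(cand)] = p_min                         # steps (11)-(12)
    return neighbor

rng = random.Random(2)
sched = {"v1": "p1", "v2": "p2", "v3": "p2", "v4": "p3"}
scores = {"p1": 0.8, "p2": 0.7, "p3": 0.3}          # hypothetical Tend values
preds = {"v4": ["v2", "v3"], "v2": ["v1"], "v3": ["v1"]}
new_sched = comm_reduction_neighbor(sched, scores, preds, rng)
```

Here p3 has the smallest Tend score, so one of v4's predecessors (v2 or v3) is pulled onto p3, removing one inter-processor edge.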

(1) randomly choose a tuple (vi, fi, pi) in m1 where fi = 0
(2) exchange the positions of (vi, fi, pi) and (vi−1, fi−1, pi−1)
(3) modify fi−1, fi, and fi+1 in m1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m1' = GenLBNeighborhood(m1)
(5) randomly choose a tuple (vj, fj, pj) in m2 where fj = 0
(6) exchange the positions of (vj, fj, pj) and (vj−1, fj−1, pj−1)
(7) modify fj−1, fj, and fj+1 in m2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m2' = GenLBNeighborhood(m2)

Algorithm 7: IntermoleGO(m1, m2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks   p1   p2   p3
v1       7    8    9
v2      12   14   16
v3      14   15   16
v4      13    3   14

before, to promote the intensification capability of DRSCRO and to avoid function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m1' and m2' from given molecules m1 and m2. This operator first uses the steps in OnWallGO to generate m1' from m1, and then the operator generates the other new molecule m2' from m2 in a similar fashion. In the end, the operator generates two new molecules m1' and m2' from m1 and m2 as an intensification search. The detailed steps are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v1, v2, v4, v3) is found based on the upward rank value of each task in the DAG, and the first molecule m = ((v1, 0, p1), (v2, 1, p1), (v4, 0, p1), (v3, 1, p1)) can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set to 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations of the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v1, 0, p3), (v2, 1, p1), (v4, 0, p2), (v3, 1, p1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations of the global optimization phase, as presented in the framework of DRSCRO in Section 4.1, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v1, 0, p2), (v4, 1, p2), (v2, 0, p2), (v3, 1, p2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability for solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.

12 Mathematical Problems in Engineering

[Figure 7: Example of IntermoleGO. Two parent molecules and the two new molecules generated from them are shown as sequences of (task, flag, processor) tuples corresponding to the DAG in Figure 1(a).]

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is obtained by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule: as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than the CEFT heuristic applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better intensification capability than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also acts as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is also utilized in the ineffective reaction operator. Third, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and, unlike the VNS proposed in [22], a new model for processor selection is utilized in the neighborhood structures to promote the efficiency of VNS. All three advantages mentioned above enhance the ability to obtain a better convergence rate and a better search result in the whole solution space, as demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).
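The VNS scheme referred to above follows the standard shake/local-search/recentre pattern of variable neighborhood search [32]. The following Python skeleton is a generic sketch of that pattern, with the neighborhood functions, local search, and fitness supplied by the caller; it is illustrative and does not reproduce DRSCRO's specific neighborhood structures or processor selection model.

```python
def vns(initial, neighborhoods, local_search, fitness, max_rounds=100):
    """Generic VNS skeleton: shake in the d-th neighborhood, improve
    locally, and restart from the first neighborhood on success."""
    m = initial
    for _ in range(max_rounds):
        d = 0
        while d < len(neighborhoods):
            m1 = neighborhoods[d](m)       # shake: pick a point in N_d(m)
            m2 = local_search(m1)          # descend to a local optimum
            if fitness(m2) < fitness(m):   # improvement: recentre, reset d
                m, d = m2, 0
            else:
                d += 1                     # escalate to a larger neighborhood
    return m
```

In DRSCRO the molecule plays the role of `m`, the makespan plays the role of `fitness`, and the load balance neighborhood structure is one of the entries in `neighborhoods`.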

5. Experimental Details

In this section, the simulation experiments and the comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], TMSCRO and DMSCRO have been proved, by theoretical analysis and experimental results, to have better performance than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and, for DRSCRO as a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and on the comparison between DRSCRO and other similar kinds of algorithms.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and the convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: the molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; row operations are applied systematically to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest number of tasks at the same level is 6. These two application graphs are used as a test bed not only to enhance the comparability of the various algorithms but also to demonstrate the application of our proposed algorithm without loss of generality. The second extensive test bed for the comparative study consists of randomly generated DAGs. A random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as the CCR, the amount of computation of a task, the number of successors of a task, and the total number of tasks in a random graph. It is also assumed that all tasks have the same computation cost and that all communication links have the same communication cost.
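A generator in the spirit of [39] can be sketched as follows. The uniform cost ranges and the exact CCR rescaling rule here are assumptions for illustration, not the parameters of the generator actually used in the experiments; edges always point from a lower-numbered task to a higher-numbered one, so the graph is acyclic by construction.

```python
import random

def random_dag(n_tasks, max_succ, ccr, avg_comp, seed=None):
    """Generate a random DAG: each task gets up to max_succ successors
    among later tasks; communication costs are rescaled so the graph's
    CCR (communication-to-computation ratio) matches `ccr` exactly."""
    rng = random.Random(seed)
    comp = {t: rng.uniform(0.5, 1.5) * avg_comp for t in range(n_tasks)}
    edges = {}
    for t in range(n_tasks - 1):
        succs = rng.sample(range(t + 1, n_tasks),
                           min(max_succ, n_tasks - 1 - t))
        for s in succs:
            edges[(t, s)] = rng.uniform(0.5, 1.5)
    # rescale communication so that mean(comm) / mean(comp) == ccr
    target = ccr * (sum(comp.values()) / n_tasks)
    scale = target / (sum(edges.values()) / len(edges))
    edges = {e: c * scale for e, c in edges.items()}
    return comp, edges
```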

As shown in Figure 2, the next-phase criterion and the stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.
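This stagnation-based stopping rule can be expressed as a small driver loop; `step` and `fitness` below stand in for one optimization iteration and the makespan evaluation, and are placeholders rather than the paper's actual routines.

```python
def run_until_stagnant(step, fitness, state, patience=5000):
    """Iterate until the best fitness (makespan) has not improved for
    `patience` consecutive iterations."""
    best = fitness(state)
    stagnant = 0
    while stagnant < patience:
        state = step(state)
        f = fitness(state)
        if f < best:
            best, stagnant = f, 0   # improvement: reset the stagnation count
        else:
            stagnant += 1
    return state, best
```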


[Figure 8: A molecular dynamics code DAG, with tasks numbered 0–40 arranged between entry and exit nodes.]

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption, under which the speed of a computing processor differs from task to task. The heterogeneity level (1 + hl)/(1 − hl) then equals the largest possible ratio of the best processor speed to the worst processor speed for each task. Unless otherwise specified in this paper, hl is set to the value that makes the heterogeneity level equal to 2.
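Under this model, each task's processor speeds can be drawn as random factors in [1 − hl, 1 + hl]; the uniform sampling below is an assumption consistent with the description, not code from the paper. For hl = 1/3 the heterogeneity level is (1 + 1/3)/(1 − 1/3) = 2, matching the hl = 0.333 entry in Table 4.

```python
import random

def heterogeneity_level(hl):
    """Largest possible ratio of best to worst processor speed for a task."""
    return (1 + hl) / (1 - hl)

def task_speeds(n_procs, hl, rng):
    """Per-task speed factor of each processor, drawn from [1-hl, 1+hl]."""
    return [rng.uniform(1 - hl, 1 + hl) for _ in range(n_procs)]
```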

The details of the parameter settings are shown in Table 4. Parameters 6–12, which apply to the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as it is a deterministic algorithm), while in the second extensive test

[Figure 9: A Gaussian elimination DAG for a matrix of size 7, with 27 tasks between entry and exit nodes.]

Table 4: Parameter values for the simulation experiments.

Parameter | Value
CCR | 0.1, 0.2, 1, 2, 5
Number of processors | 4, 8, 16, 32
hl | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph | 10, 20, 50
InitialKE | 1000
θ | 500
ϑ | 10
Buffer | 200
KELossRate | 0.2
MoleColl | 0.2
PopSize | 10

bed, the makespan is averaged over 30 different random-graph running instances. Moreover, to prove the robustness


[Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0, for HEFT, DMSCRO, TMSCRO, and DRSCRO with |P| = 4, 8, 16, and 32.]

[Figure 11: Average makespan for Gaussian elimination, CCR = 0.2, for HEFT, DMSCRO, TMSCRO, and DRSCRO with |P| = 4, 8, 16, and 32.]

of DRSCRO, the best final value achieved over all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5–8 list the details of the experimental results.

As shown in Figures 10 and 11, the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance: according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance when averaged over all possible fitness functions. Since the TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature, this suggests that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification capability, obtained by applying VNS

[Figure 12: Average makespan for the molecular dynamics code under CCR = 0.1, 0.2, 1, 2, and 5 for HEFT, DMSCRO, TMSCRO, and DRSCRO; the number of processors is 16.]

[Figure 13: Average makespan for Gaussian elimination under CCR = 0.1, 0.2, 1, 2, and 5 for HEFT, DMSCRO, TMSCRO, and DRSCRO; the number of processors is 8.]

and by utilizing one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, when the stopping criterion is satisfied, the average results obtained by DRSCRO are better than those obtained by TMSCRO and DMSCRO. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by its heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. The average makespan increases with the CCR value because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and that the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance over a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (avg makespan) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8 | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (avg makespan) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
16 | 82.356 | 80.668 | 80.235 | 80.074 | 78.433 | 82.015 | 1.706
32 | 75.630 | 74.080 | 73.682 | 73.535 | 72.027 | 75.317 | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (avg makespan) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2 | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5 | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (avg makespan) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 96.612 | 94.632 | 94.124 | 93.935 | 92.010 | 96.212 | 2.001
0.2 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
1 | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2 | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5 | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range, and that metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments. The details of the experimental results are listed in Tables 9–11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO also has better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases: the average makespan rises rapidly with increasing CCR because the DAG becomes more communication-intensive, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (avg makespan) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8 | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the processor number is 32 and CCR = 1.0.

Number of tasks | HEFT (avg makespan) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
10 | 714.484 | 706.149 | 699.100 | 690.708 | 688.982 | 693.638 | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for the random graphs under different CCRs; the task number is 50.

Value of CCR | 4 processors | 8 processors | 16 processors | 32 processors
0.1 | 145.093 | 116.068 | 116.058 | 114.592
0.2 | 145.261 | 116.200 | 116.163 | 114.700
1 | 153.248 | 120.228 | 119.511 | 118.534
2 | 194.401 | 166.014 | 164.246 | 162.292
5 | 508.759 | 498.427 | 492.049 | 487.830

[Figure 14: Average makespan for random graphs under different processor numbers (|P| = 4, 8, 16, 32) for HEFT, DMSCRO, TMSCRO, and DRSCRO; the task number is 50 and CCR = 0.2.]

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of all three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO starts at the beginning of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random-graph running instances.

[Figure 15: Average makespan for random graphs under different task numbers (10, 20, 50) for HEFT, DMSCRO, TMSCRO, and DRSCRO; the processor number is 32 and CCR = 1.0.]

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets containing 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


[Figure 16: Average makespan of DRSCRO for the random graphs under different CCRs (0.1, 0.2, 1, 2, 5) with P = 4, 8, 16, and 32; the task number is 50.]

[Figure 17: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for the molecular dynamics code; CCR = 1 and the number of processors is 16.]

[Figure 18: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for Gaussian elimination; CCR = 0.2 and the number of processors is 8.]

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO still gives it a better convergence rate than TMSCRO

[Figure 19: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for the set of randomly generated DAGs with 10 tasks.]

[Figure 20: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for the set of randomly generated DAGs with 20 tasks.]

[Figure 21: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for the set of randomly generated DAGs with 50 tasks.]

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, a statistical analysis based on the average values achieved is presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Friedman test | DRSCRO vs. DMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO vs. TMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO vs. DMSCRO, TMSCRO | 6.70E-03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Quade test | DRSCRO vs. DMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO vs. TMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO vs. DMSCRO, TMSCRO | 1.09E-03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis of the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, is necessary in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
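For reference, the Friedman statistic for k algorithms over n problem instances can be computed directly from per-instance ranks. The pure-Python sketch below uses the textbook formula, assumes no ties within an instance, and treats a lower makespan as a better (smaller) rank; it is illustrative, not the test implementation used in the paper.

```python
def friedman_statistic(results):
    """Friedman chi-square statistic. `results` is a list of blocks
    (problem instances), each holding one score per algorithm;
    lower scores receive smaller ranks. Assumes no ties."""
    n, k = len(results), len(results[0])
    rank_sums = [0] * k
    for block in results:
        order = sorted(range(k), key=lambda j: block[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    # chi-square statistic with k - 1 degrees of freedom
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)
```

The statistic is then compared against the chi-square distribution with k − 1 degrees of freedom to obtain the p value.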

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively, both of which reject the null hypothesis of equivalent performance. In both tests, our proposed DRSCRO is compared not only against all the algorithms together but also pairwise against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of other similar metaheuristic algorithms, because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance for searching optimal solutions when averaged over all possible fitness functions. However, as the experimental results of the convergence tests show, the proposed DRSCRO can find good solutions faster than the other similar metaheuristic algorithms. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO obtains a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, takes advantage of the VNS algorithm in the global optimization phase to improve its optimization capability. A load-balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability, and the new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load-balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms, as far as we know, and that DRSCRO can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to aim at two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighbourhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.


Page 10: Research Article DRSCRO: A Metaheuristic Algorithm for ...downloads.hindawi.com/journals/mpe/2015/396582.pdf · Research Article DRSCRO: A Metaheuristic Algorithm for Task Scheduling

10 Mathematical Problems in Engineering

(1) pop_subset = InitVNS(pop_set)
(2) Select the set of neighborhood structures Neigh_d (d = 1, 2, 3, ..., d_max)
(3) for each individual m in pop_subset do
(4)   d = 1
(5)   while d < d_max do
(6)     Randomly generate a molecule m1 from the d-th neighborhood of m
(7)     Apply some local search method with m1 as the initial molecule (the local optimum is presented by m2)
(8)     if m2 is better than m then
(9)       m = m2
(10)      d = 1
(11)    else
(12)      d = d + 1
(13)    end if
(14)  end while
(15) until the termination condition is satisfied
(16) end for
(17) execute the combination strategy

Algorithm 4: InitMoleGOVNS(pop_set), generating the initial population of the global optimization phase.
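The shake-and-improve loop at the heart of Algorithm 4 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the molecule representation, neighborhood functions, local search, and fitness are all placeholders.

```python
def vns(molecule, neighborhoods, local_search, fitness, d_max):
    """Basic VNS loop of Algorithm 4 (a sketch): draw a point from the d-th
    neighborhood, improve it by local search, and restart from the first
    neighborhood whenever the result improves on the incumbent."""
    d = 0
    while d < d_max:
        shaken = neighborhoods[d](molecule)   # a molecule from Neigh_d(m)
        candidate = local_search(shaken)      # local optimum m2
        if fitness(candidate) < fitness(molecule):
            molecule, d = candidate, 0        # m = m2, back to d = 1
        else:
            d += 1                            # try the next neighborhood
    return molecule

# Toy demo: integers stand in for molecules, |x| for makespan, and two
# deterministic "neighborhoods" replace the random shaking step.
best = vns(3, [lambda x: x - 1, lambda x: x + 1], lambda x: x, abs, 2)
```

In DRSCRO the neighborhoods would be the load-balance and communication-reduction structures of Algorithms 5 and 6, and the loop would additionally respect the dual termination criteria described below.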

(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_max with the largest Tend(p_max)
(5) Randomly choose a task v_random from exec(p_max)
(6) Randomly choose a processor p_random different from p_max
(7) Reallocate v_random to the processor p_random
(8) Encode and reschedule the changed solution ω′
(9) return ω′

Algorithm 5: GenNeighborhoodofLoadBalance(ω).

Different from the original ones in [21], we develop a new model for processor selection. Let TCload(p_i) be the total task execution cost of processor p_i, and let TCcomm(p_i) be the communication cost overhead of processor p_i, as defined in [21]. The values of TCload(p_i) and TCcomm(p_i) measure the tendencies of load balancing and communication reducing, respectively (i.e., the tendency of reducing or increasing the tasks on a processor): the greater TCload(p_i) is, the stronger the tendency to reduce tasks on p_i is, and the greater TCcomm(p_i) is, the stronger the tendency to increase tasks on p_i is. Therefore, a parameter Tend(p_i) is developed to measure the overall tendency by combining TCload(p_i) and TCcomm(p_i), as in (11). The neighborhood structure computation processes for load balance and communication reduction are presented in Algorithms 5 and 6, respectively. The proposed model takes both intuitions into comprehensive consideration and can make the VNS algorithm more effective than the original one.

Tend(p_i) = 1 / (1 + e^(-TCload(p_i)/TCcomm(p_i))).  (11)
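As a concrete illustration of (11), the following Python sketch computes Tend for a few processors and picks the source processor for load-balance moves and the target processor for communication-reduction moves. The TCload/TCcomm values are made up for the example.

```python
import math

def tend(tc_load, tc_comm):
    """Processor-selection tendency of Eq. (11): a logistic function of the
    ratio of execution load to communication overhead. Values near 1 mean a
    strong tendency to move tasks OFF the processor, values near 0 a strong
    tendency to move tasks ONTO it."""
    return 1.0 / (1.0 + math.exp(-(tc_load / tc_comm)))

# Hypothetical per-processor (TCload, TCcomm) pairs, for illustration only.
costs = {"p1": (40.0, 10.0), "p2": (15.0, 30.0), "p3": (25.0, 20.0)}
tendencies = {p: tend(load, comm) for p, (load, comm) in costs.items()}

p_max = max(tendencies, key=tendencies.get)  # source in Algorithm 5
p_min = min(tendencies, key=tendencies.get)  # target in Algorithm 6
```

Here the heavily loaded p1 yields the largest Tend and would give up a task in the load-balance neighborhood, while the communication-heavy p2 yields the smallest Tend and would receive one in the communication-reduction neighborhood.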

The VNS algorithm in DRSCRO utilizes dual termination criteria. Termination criterion 1 sets the upper bound of the local search iterations to 20, and termination criterion 2 sets the maximum number of iterations without improvement to 3. The VNS algorithm stops when either criterion is satisfied. To form a new initial population, a combination strategy is applied after the VNS algorithm outputs its subset of the population: the current population and the VNS output are first merged and sorted by increasing makespan, and then the first PopSize molecules are selected as the new initial population.
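The combination strategy reduces to a merge-sort-truncate step. A minimal sketch, in which molecules are represented directly by their makespan values for brevity:

```python
def combine(current_pop, vns_output, pop_size, makespan):
    """Combination strategy: merge the current population with the VNS
    output, sort by increasing makespan, and keep the best pop_size
    molecules as the new initial population."""
    merged = sorted(current_pop + vns_output, key=makespan)
    return merged[:pop_size]

# Toy demo with PopSize = 3; in DRSCRO each element would be a molecule
# and makespan() would decode and evaluate it.
new_pop = combine([50, 42, 47], [44, 40], 3, makespan=lambda m: m)
```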

4.4.2. Elementary Chemical Reaction Operators. The operators for global optimization not only vary p_i of each tuple but also interchange the positions of the tuples in a molecule, as intensification searches or diversification searches [25], to optimize the whole solution.

On-wall ineffective collision (an intensification search), decomposition (a diversification search), and synthesis (a diversification search) are as presented in [30], and we do not repeat them here so as to focus on our main work. In [30], the function of the ineffective collision operator is similar to that of the on-wall ineffective collision operator. Therefore, different from [30], a modified ineffective collision operator is proposed in this paper; it utilizes the load balance neighborhood structure used in the VNS mentioned


(1) for each processor p_i in the solution ω do
(2)   Compute Tend(p_i)
(3) end for
(4) Choose the processor p_min with the smallest Tend(p_min)
(5) Set the candidate set cand empty
(6) for each task v_i in the set exec(p_min) do
(7)   Compute the set predecessor(v_i)
(8)   Update predecessor(v_i) with predecessor(v_i) = predecessor(v_i) − exec(p_min)
(9)   cand = cand + predecessor(v_i)
(10) end for
(11) Randomly choose a task v_random from cand
(12) Reallocate v_random to the processor p_min
(13) Encode and reschedule the changed solution ω′
(14) return ω′

Algorithm 6: GenNeighborhoodofCommReduction(ω).
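A compact Python rendering of Algorithm 6, using a plain task-to-processor map instead of the encoded molecule; the encode-and-reschedule step after the move is elided, and all names are illustrative.

```python
import random

def comm_reduction_neighbor(assign, tendencies, preds, rng=random):
    """Sketch of Algorithm 6: pick the processor with the smallest tendency
    Tend, collect predecessors of its tasks that currently run elsewhere,
    and pull one of them onto it, removing an inter-processor edge."""
    p_min = min(tendencies, key=tendencies.get)
    on_p_min = {v for v, p in assign.items() if p == p_min}
    # Candidate tasks: predecessors of p_min's tasks living on other processors.
    cand = sorted({u for v in on_p_min for u in preds.get(v, ())} - on_p_min)
    if not cand:
        return dict(assign)
    new_assign = dict(assign)
    new_assign[rng.choice(cand)] = p_min  # reallocate; rescheduling elided
    return new_assign

# Hypothetical 4-task example: v2, a predecessor of v4, is pulled onto p1.
moved = comm_reduction_neighbor(
    {"v1": "p1", "v2": "p2", "v3": "p2", "v4": "p1"},
    {"p1": 0.2, "p2": 0.8},
    {"v2": ["v1"], "v3": ["v1"], "v4": ["v2"]})
```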

(1) choose randomly a tuple (v_i, f_i, p_i) in m1 where f_i = 0
(2) exchange the positions of (v_i, f_i, p_i) and (v_{i−1}, f_{i−1}, p_{i−1})
(3) modify f_{i−1}, f_i, and f_{i+1} in m1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m1′ = GenLBNeighborhood(m1)
(5) choose randomly a tuple (v_j, f_j, p_j) in m2 where f_j = 0
(6) exchange the positions of (v_j, f_j, p_j) and (v_{j−1}, f_{j−1}, p_{j−1})
(7) modify f_{j−1}, f_j, and f_{j+1} in m2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m2′ = GenLBNeighborhood(m2)

Algorithm 7: IntermoleGO(m1, m2).

Table 3: Execution cost of DAG tasks by each processor.

Tasks | p1 | p2 | p3
v1    | 7  | 8  | 9
v2    | 12 | 14 | 16
v3    | 14 | 15 | 16
v4    | 13 | 3  | 14

before, to promote the intensification capability of DRSCRO and to avoid function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m1′ and m2′ from given molecules m1 and m2. This operator first uses the steps in OnWallGO to generate m1′ from m1, and then it generates the other new molecule m2′ from m2 in a similar fashion. In the end, the operator has generated two new molecules m1′ and m2′ from m1 and m2 as an intensification search. The detailed steps are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v1, v2, v4, v3) is found based on the upward rank value of each task in the DAG, and the first molecule m = ((v1, 0, p1), (v2, 1, p1), (v4, 0, p1), (v3, 1, p1)) can be obtained. Algorithm 2, InitMoleSMS, is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations of the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v1, 0, p3), (v2, 1, p1), (v4, 0, p2), (v3, 1, p1)) can be obtained.

In the global optimization phase, Algorithm 4, InitMoleGOVNS, is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations of the global optimization phase, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v1, 0, p2), (v4, 1, p2), (v2, 0, p2), (v3, 1, p2)).

4.6. Analysis of DRSCRO. As a new metaheuristic strategy, the CRO-based methods for DAG scheduling, which were proposed very recently, have demonstrated the capability for solving this kind of NP-hard optimization problem. By analyzing the framework, molecular structure, chemical reaction operators, and operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is obtained by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule: as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT, applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better intensification search capability than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also acts as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is utilized in the ineffective reaction operator as well. Third, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is utilized in the neighborhood structures to promote the efficiency of VNS, different from the VNS proposed in [22]. All three advantages mentioned above enhance the ability to achieve better convergence rapidity and better search results over the whole solution space, as demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + d_max × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and the comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], by theoretical analysis and experimental results, TMSCRO and DMSCRO have been proved to have better performance than GA; therefore our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and, for DRSCRO as a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and on the comparison between DRSCRO and other similar kinds of algorithms.

First, the two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and the convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; it systematically applies row operations to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest task number at the same level is 6. These two application graphs are used as a test bed not only to enhance the comparability of the various algorithms but also to show the application of our proposed algorithm as an illustrative demonstration, without loss of generality. The second extensive test bed for the comparative study consists of DAGs of random graphs. A random graph generator, as presented in [39], is implemented to generate random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as the CCR, the amount of computation of a task, the successor number of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and all communication links have the same computation cost and communication cost, respectively.
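A layered generator in the spirit of [39] can be sketched as follows. The parameter names are illustrative, every edge points from a task to a later one so the graph is guaranteed acyclic, and the uniform cost assumption above is encoded by giving each task the same computation cost and each edge the same communication cost (taken here as CCR × computation cost, a simplification).

```python
import random

def random_dag(num_tasks, max_successors, comp_cost, ccr, rng=random):
    """Random-DAG sketch: each task v gets 1..max_successors successors
    among the later tasks, so edges always point forward and the graph
    stays acyclic. All tasks share one computation cost and all edges one
    communication cost, matching the test-bed assumption."""
    edges = []
    for v in range(num_tasks - 1):
        wanted = rng.randint(1, max_successors)
        available = num_tasks - v - 1
        succs = rng.sample(range(v + 1, num_tasks), min(wanted, available))
        edges.extend((v, u) for u in succs)
    comp = {v: comp_cost for v in range(num_tasks)}
    comm = {e: ccr * comp_cost for e in edges}
    return comp, comm, edges

# Example instance: 10 tasks, at most 3 successors each, CCR = 0.2.
comp, comm, edges = random_dag(10, 3, 20.0, 0.2, random.Random(7))
```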

As shown in Figure 2, the next-phase criterion and the stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.


Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speeds of a computing processor are different for different tasks. In doing so, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set to the value that makes the heterogeneity level 2, unless otherwise specified in this paper.
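The relation between hl and the heterogeneity level, and its inverse, are a one-liner each; a level of 2 corresponds to hl = 1/3, i.e., the 0.333 entry in Table 4:

```python
def heterogeneity_level(hl):
    """Biggest possible ratio of best to worst processor speed for a task."""
    return (1 + hl) / (1 - hl)

def hl_for_level(level):
    """Inverse relation: the hl value that yields a target heterogeneity level."""
    return (level - 1) / (level + 1)

hl = hl_for_level(2)  # the level-2 setting used throughout the paper
```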

The details of the parameter setting are shown in Table 4. Parameters 6–12, which concern the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value over a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiments.

Parameter | Value
CCR | 0.1, 0.2, 1, 2, 5
Number of processors | 4, 8, 16, 32
hl | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph | 10, 20, 50
InitialKE | 1000
θ | 500
ϑ | 10
Buffer | 200
KELossRate | 0.2
MoleColl | 0.2
PopSize | 10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5–8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance on searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; the similar performance therefore suggests that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification search capability, by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and the utilization of one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the average results obtained by DRSCRO are better than those obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of its heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the average makespan increases with the CCR value. This is because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance in a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4  | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8  | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4  | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8  | 96.952  | 94.965  | 94.455  | 94.266  | 92.333  | 96.551  | 2.008
16 | 82.356  | 80.668  | 80.235  | 80.074  | 78.433  | 82.015  | 1.706
32 | 75.630  | 74.080  | 73.682  | 73.535  | 72.027  | 75.317  | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1   | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2   | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5   | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

Value of CCR | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
0.1 | 96.612  | 94.632  | 94.124  | 93.935  | 92.010  | 96.212  | 2.001
0.2 | 96.952  | 94.965  | 94.455  | 94.266  | 92.333  | 96.551  | 2.008
1   | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2   | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5   | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range, and that metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of the experimental results are listed in Tables 9–11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases. It can be seen that the average makespan increases rapidly with the value of CCR; this is because the DAG becomes more communication-intensive as CCR increases, which leads to the processors staying in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Number of processors | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
4  | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8  | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the processor number is 32 and CCR = 1.0.

Number of tasks | HEFT (average makespan) | DMSCRO (average makespan) | TMSCRO (average makespan) | DRSCRO (average makespan) | DRSCRO (best makespan) | DRSCRO (worst makespan) | DRSCRO (variance)
10 | 714.484  | 706.149  | 699.100  | 690.708  | 688.982  | 693.638  | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for the random graphs under different CCRs; the task number is 50.

Value of CCR | |P| = 4 | |P| = 8 | |P| = 16 | |P| = 32
0.1 | 145.093 | 116.068 | 116.058 | 114.592
0.2 | 145.261 | 116.200 | 116.163 | 114.700
1   | 153.248 | 120.228 | 119.511 | 118.534
2   | 194.401 | 166.014 | 164.246 | 162.292
5   | 508.759 | 498.427 | 492.049 | 487.830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of all three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.

Figure 15: Average makespan for random graphs under different task numbers; the processor number is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO when processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graphs under different CCRs; the task number is 50.

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16.

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also gives it a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation experimental results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, a statistical analysis based on the achieved average values is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Friedman test | DRSCRO, DMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO, TMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO, DMSCRO, TMSCRO | 6.70E-03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Quade test | DRSCRO, DMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO, TMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO, DMSCRO, TMSCRO | 1.09E-03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis of the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, is necessary in order to find significant differences among these results. Nonparametric tests, following the recommendations in [40], are specifically considered, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
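For reference, the Friedman chi-square statistic on per-instance average makespans can be computed as below. This is a self-contained sketch with made-up numbers, not the paper's data; in practice a library routine such as scipy.stats.friedmanchisquare would be used, and ties would need average-rank handling.

```python
def friedman_statistic(*algorithms):
    """Friedman chi-square statistic over k algorithms and n problem
    instances. Minimal sketch assuming no ties within a row; the statistic
    is compared against a chi-square distribution with k - 1 degrees of
    freedom."""
    k, n = len(algorithms), len(algorithms[0])
    rank_sums = [0.0] * k
    for i in range(n):
        order = sorted(range(k), key=lambda j: algorithms[j][i])
        for rank, j in enumerate(order, start=1):  # rank 1 = smallest makespan
            rank_sums[j] += rank
    mean = (k + 1) / 2
    return 12 * n / (k * (k + 1)) * sum((r / n - mean) ** 2 for r in rank_sums)

# Illustrative average makespans over four instances (made-up numbers).
drscro = [116.7, 123.3, 108.7, 145.3]
tmscro = [117.0, 123.6, 108.9, 145.6]
dmscro = [117.6, 124.3, 109.5, 146.3]
chi2 = friedman_statistic(drscro, tmscro, dmscro)
```

With DRSCRO ranked first on every instance, the statistic takes its maximal value for k = 3 and n = 4, which is the situation the tests in Tables 12 and 13 formalize.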

Tables 12 and 13 list the test results of the Friedman test and the Quade test, respectively, which both reject the null hypothesis of equivalent performance. In both tests, our proposed DRSCRO is compared not only against all the algorithms together but also against each of the remaining ones, with DRSCRO as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind, because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions when averaged over all possible fitness functions. However, as the experimental results of the convergence tests show, the proposed DRSCRO can find good solutions faster than the other similar kinds of metaheuristic algorithms. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, takes advantage of

the VNS algorithm in the global optimization phase to improve the optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for better intensification capability, and the new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms, as far as we know, and DRSCRO can also obtain better average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to aim at two main objectives: (1) minimization of schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

Mathematical Problems in Engineering

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.


(1) for each processor pi in the solution ω do
(2)   Compute Tend(pi)
(3) end for
(4) Choose the processor pmin with the smallest Tend(pmin)
(5) Set the candidate set cand empty
(6) for each task vi in the set exec(pmin) do
(7)   Compute the set predecessor(vi)
(8)   Update predecessor(vi) with predecessor(vi) = predecessor(vi) − exec(pmin)
(9)   cand = cand + predecessor(vi)
(10) end for
(11) Randomly choose a task vrandom from cand
(12) Reallocate vrandom to the processor pmin
(13) Encode and reschedule the changed solution ω'
(14) return ω'

Algorithm 6: GenNeighborhoodofCommReduction(ω).
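As a concrete illustration of the neighborhood generation above, the following Python sketch produces a communication-reduction neighbor of a solution. This is an illustrative sketch, not the authors' implementation: the data structures `assign` (task to processor), `tend` (processor to finish time), and `preds` (task to predecessor set) are assumed names, and the final re-encoding/rescheduling step (steps (13)-(14)) is omitted.

```python
import random

def comm_reduction_neighbor(assign, tend, preds, rng=None):
    """Sketch of GenNeighborhoodofCommReduction: pick the processor that
    finishes earliest and pull onto it a random predecessor (currently
    placed elsewhere) of one of its tasks, reducing communication."""
    rng = rng or random.Random(0)
    p_min = min(tend, key=tend.get)                  # step (4): earliest-finishing processor
    exec_pmin = {t for t, p in assign.items() if p == p_min}
    cand = set()                                     # steps (5)-(10): build candidate set
    for t in exec_pmin:
        cand |= (preds.get(t, set()) - exec_pmin)    # predecessors not already on p_min
    if not cand:
        return dict(assign)                          # nothing to move
    moved = rng.choice(sorted(cand))                 # step (11): random candidate
    neighbor = dict(assign)
    neighbor[moved] = p_min                          # step (12): reallocate to p_min
    return neighbor
```

A full implementation would then re-encode and reschedule the changed solution; that bookkeeping is left out here.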

(1) choose randomly a tuple (vi, fi, pi) in m1 where fi = 0
(2) exchange the positions of (vi, fi, pi) and (vi−1, fi−1, pi−1)
(3) modify fi−1, fi, and fi+1 in m1 as defined in the last paragraph of Section 4.2.1
(4) generate a new molecule m1' = GenLBNeighborhood(m1)
(5) choose randomly a tuple (vj, fj, pj) in m2 where fj = 0
(6) exchange the positions of (vj, fj, pj) and (vj−1, fj−1, pj−1)
(7) modify fj−1, fj, and fj+1 in m2 as defined in the last paragraph of Section 4.2.1
(8) generate a new molecule m2' = GenLBNeighborhood(m2)

Algorithm 7: IntermoleGO(m1, m2).

Table 3: Execution cost of DAG tasks on each processor.

Tasks | p1 | p2 | p3
v1 | 7 | 8 | 9
v2 | 12 | 14 | 16
v3 | 14 | 15 | 16
v4 | 13 | 3 | 14

before, to promote the intensification capability of DRSCRO and avoid function duplication.

The operator IntermoleGO (i.e., the ineffective collision operator) is used to generate new molecules m1' and m2' from given molecules m1 and m2. This operator first uses the steps in OnWallGO to generate m1' from m1 and then generates the other new molecule m2' from m2 in a similar fashion. In the end, the operator obtains the two new molecules m1' and m2' from m1 and m2 as an intensification search. The detailed steps are presented in Algorithm 7. Figure 7 shows an example of IntermoleGO, in which the molecules correspond to the DAG shown in Figure 1(a).

4.5. Illustrative Example. Consider the example shown in Figure 1(a). Its edges are labeled with the communication costs, whereas the execution costs are shown in Table 3.

Initially, the path (v1, v2, v4, v3) is found based on the upward rank value of each task in the DAG, and the first molecule m = ((v1, 0, p1), (v2, 1, p1), (v4, 0, p1), (v3, 1, p1)) can be obtained. Algorithm 2 InitMoleSMS is then executed to generate the initial population with 10 elements (i.e., PopSize is set as 10) for the super molecule selection phase. The initial population molecules are operated on during the iterations of the super molecule selection phase, as presented in the framework of DRSCRO in Section 4.1, and the super molecule SMole = ((v1, 0, p3), (v2, 1, p1), (v4, 0, p2), (v3, 1, p1)) can be obtained.

In the global optimization phase, Algorithm 4 InitMoleGOVNS is then executed to generate (or to update) the initial population after each iteration, as presented in Section 4.1. The molecules are operated on during the iterations of the global optimization phase, and the global minimal makespan = 40 is finally obtained, for which the corresponding solution (i.e., molecule) is ((v1, 0, p2), (v4, 1, p2), (v2, 0, p2), (v3, 1, p2)).
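To make the tuple encoding above concrete, the following sketch checks that a molecule's task order respects the DAG's precedence constraints. The diamond-shaped edge set used here (v1→v2, v1→v4, v2→v3, v4→v3) is an assumption consistent with the two valid orderings in the example, not taken verbatim from Figure 1(a).

```python
def order_is_valid(molecule, edges):
    """molecule: list of (task, flag, processor) tuples in scheduling order;
    edges: set of (pred, succ) pairs.  Returns True if every task appears
    after all of its predecessors."""
    pos = {task: i for i, (task, _flag, _proc) in enumerate(molecule)}
    return all(pos[u] < pos[v] for (u, v) in edges)

# Assumed diamond DAG, consistent with the illustrative example.
edges = {("v1", "v2"), ("v1", "v4"), ("v2", "v3"), ("v4", "v3")}

m_initial = [("v1", 0, "p1"), ("v2", 1, "p1"), ("v4", 0, "p1"), ("v3", 1, "p1")]
m_final   = [("v1", 0, "p2"), ("v4", 1, "p2"), ("v2", 0, "p2"), ("v3", 1, "p2")]
```

Both the initial molecule and the final solution pass this check, since v2 and v4 may appear in either order between v1 and v3.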

4.6. Analysis of DRSCRO. As a new kind of metaheuristic strategy, the CRO-based methods for DAG scheduling, which have been proposed very recently, have demonstrated their capability for solving this kind of NP-hard optimization problem. By analyzing the framework, the molecular structure, the chemical reaction operators, and the operational environment of DRSCRO, it can be shown, to some extent, that the DRSCRO scheme has three advantages in comparison with other CRO-based algorithms for DAG scheduling.


Figure 7: Example of IntermoleGO.

First, to some degree, the super molecule in DRSCRO is similar to InitS in TMSCRO [30] or the "elite" in GA [31]. However, the "elite" in GA is usually generated from two chromosomes, while the super molecule is obtained by executing the first phase of DRSCRO. Moreover, in comparison with TMSCRO, DRSCRO uses a metaheuristic strategy (CRO) to get a better super molecule. This is because, as an intelligent random-search algorithm, the CRO used in the super molecule selection phase of DRSCRO searches a wider area of the solution space than CEFT, applied in TMSCRO, which narrows the search down to a very small portion of the solution space. As a result, a better super molecule may contribute to a better global optimum solution and accelerate convergence. Second, the DAG scheduling problem has two complex aspects, task sequence optimization and processor assignment optimization, which lead to a very large and complicated solution space. So, for a better capability of intensification search than other CRO-based algorithms for DAG scheduling on heterogeneous systems, DRSCRO applies the VNS algorithm as the initialization of the global optimization phase, which also acts as a local search process during the running of DRSCRO, and one of the neighborhood structures of VNS is utilized in the ineffective reaction operator as well. Third, during the running of DRSCRO, the task priority is changeable in our adopted VNS algorithm, and a new model for processor selection is also utilized in the neighborhood structures to promote the efficiency of VNS, different from the VNS proposed in [22]. All three advantages mentioned above enhance the ability to obtain a better rapidity of convergence and better search results in the whole solution space, which is demonstrated by the experimental results in Sections 5.3 and 5.4. The time complexity of DRSCRO is O(iter × (2 × |V| + 4 × |E| × |P| + dmax × subpopNum)).

5. Experimental Details

In this section, the simulation experiments and the comparative evaluation of HEFT, DMSCRO, TMSCRO, and the proposed DRSCRO are presented. As presented in [27, 30], TMSCRO and DMSCRO proved, by theory analysis and experimental results, to have better performance than GA; therefore, our work is a further study of CRO-based algorithms for DAG scheduling on heterogeneous systems, and, for DRSCRO as a metaheuristic algorithm, we focus on the performance of the proposed algorithm itself and on the comparison between DRSCRO and other similar kinds of algorithms.

First, two extensive sets of graphs used as the test beds for the comparative study are described. Next, the parameter settings used in the simulation experiments are presented. The results and analysis of the experiments, including the makespan tests and the convergence rate tests, are given in the final part.

5.1. Test Bed. As presented in [27, 30], two extensive sets of DAGs, real-world application graphs and randomly generated application graphs, are considered as the test beds in the experiments to enhance the comparability of the various algorithms. The first extensive test bed consists of two real-world problem DAGs: the molecular dynamics code [38] and Gaussian elimination [8]. Molecular dynamics is a computer simulation of the physical movements of molecules and atoms, which are allowed to interact for a period of time, in the context of N-body simulation. The molecular dynamics code DAG is shown in Figure 8. Gaussian elimination is used to calculate the solution of a linear equation system; row operations are applied systematically to convert a set of linear equations to upper triangular form. As shown in Figure 9, the total number of tasks in the Gaussian elimination DAG with a matrix size of 7 is 27, and the largest task number at the same level is 6. The reason for using these two application graphs as a test bed is not only to enhance the comparability of the various algorithms but also to show the practical application of our proposed algorithm as an illustrative demonstration without loss of generality. The second extensive test bed for the comparative study consists of DAGs of random graphs. A random graph generator presented in [39] is implemented to generate the random graphs in the simulation experiments. It allows the user to generate a variety of random graphs with different characteristics, such as CCR, the amount of computation of a task, the number of successors of a task, and the total number of tasks in a random graph. It is also assumed that all tasks and communication links have the same computation cost and communication cost, respectively.
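The generator of [39] is not reproduced here, but a minimal generator with the same controllable characteristics (task count, successor count, CCR) might be sketched as follows; the cost range and the scaling scheme are illustrative assumptions, not the published generator.

```python
import random

def random_dag(n_tasks, max_succ, ccr, seed=0):
    """Generate a random task DAG.  Returns (comp, edges): comp[i] is the
    computation cost of task i; edges maps (u, v) to a communication cost.
    Edges only go from lower to higher indices, so the graph is acyclic.
    Communication costs are scaled so that the average communication cost
    divided by the average computation cost equals ccr."""
    rng = random.Random(seed)
    comp = [rng.randint(10, 50) for _ in range(n_tasks)]       # assumed cost range
    edges = {}
    for u in range(n_tasks - 1):
        succ_count = min(rng.randint(1, max_succ), n_tasks - 1 - u)
        for v in rng.sample(range(u + 1, n_tasks), succ_count):
            edges[(u, v)] = rng.randint(10, 50)                # raw comm cost
    avg_comp = sum(comp) / len(comp)
    avg_comm = sum(edges.values()) / len(edges)
    scale = ccr * avg_comp / avg_comm                          # enforce the target CCR
    for e in edges:
        edges[e] *= scale
    return comp, edges
```

Acyclicity holds by construction, and the CCR of the generated graph matches the requested value exactly after scaling.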

As shown in Figure 2, the next-phase criterion and the stopping criterion of DRSCRO are that the makespan stays unchanged for 5000 consecutive iterations in the search loop, and the stopping criterion of TMSCRO and DMSCRO is that the makespan remains the same for 10000 consecutive iterations.


Figure 8: A molecular dynamics code DAG.

5.2. Parameter Setting. In the experiments, a parameter hl is set to represent the heterogeneity level, as presented in the first paragraph of Section 3. It complies with the MHM model assumption and results in the fact that the speeds of a computing processor are different for different tasks. In this setting, the heterogeneity level (1 + hl)/(1 − hl) is equal to the biggest possible ratio of the best processor speed to the worst processor speed for each task. hl is set to the value that makes the heterogeneity level 2, unless otherwise specified in this paper.
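The relation between hl and the heterogeneity level can be sketched as follows; the uniform sampling of per-task processor speeds is an illustrative assumption of the MHM-style model, not a detail given in this section.

```python
import random

def hl_for_level(level):
    """Solve (1 + hl) / (1 - hl) = level for hl."""
    return (level - 1) / (level + 1)

def task_speeds(n_processors, hl, rng=None):
    """Sample per-task processor speeds in [1 - hl, 1 + hl], so the ratio of
    the best to the worst possible speed is at most (1 + hl) / (1 - hl)."""
    rng = rng or random.Random(0)
    return [rng.uniform(1 - hl, 1 + hl) for _ in range(n_processors)]

hl = hl_for_level(2)  # hl = 1/3, i.e. the 0.333 listed in Table 4
```

Solving for a heterogeneity level of 2 yields hl = 1/3, which is exactly the 0.333 value used in the parameter table.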

The details of the parameter setting are shown in Table 4. The parameters 6–12, which concern the CRO-based algorithms tested in the simulation, are set as presented in [25].

5.3. Makespan Tests. The performance of the proposed algorithm is compared with two state-of-the-art CRO-based scheduling algorithms, DMSCRO and TMSCRO, and a heuristic algorithm, HEFT. Each makespan value plotted in the graphs is the average value of a number of independent runs. In the first extensive test bed, the makespan is averaged over 10 independent runs (HEFT is run only once, as a deterministic algorithm), while in the second extensive test

Figure 9: A Gaussian elimination DAG for a matrix of size 7.

Table 4: Parameter values for the simulation experiments.

Parameter | Value
CCR | 0.1, 0.2, 1, 2, 5
Number of processors | 4, 8, 16, 32
hl | 0.333
Successor number of a task in a random graph | 1, 2, 3, 4
Total number of tasks in a random graph | 10, 20, 50
InitialKE | 1000
θ | 500
ϑ | 10
Buffer | 200
KELossRate | 0.2
MoleColl | 0.2
PopSize | 10

bed, the makespan is averaged over 30 different random graph running instances. Moreover, to prove the robustness


Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0.

Figure 11: Average makespan for Gaussian elimination, CCR = 0.2.

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 5–8 list the details of the experimental results.

As shown in Figures 10 and 11, it can be observed that the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance in searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO used in the simulation are well designed and taken from the literature; this therefore indicates that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better capability of intensification search, by applying VNS

Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16.

Figure 13: Average makespan for Gaussian elimination; the number of processors is 8.

and by utilizing one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the performance of the average results obtained by DRSCRO is better than that obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of the heuristics.

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. It can be seen that the average makespan increases with the CCR value, because the heterogeneous processors are in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT, and the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance over a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

The number of processors | HEFT (avg makespan) | DMSCRO (avg makespan) | TMSCRO (avg makespan) | DRSCRO (avg makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8 | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

The number of processors | HEFT (avg makespan) | DMSCRO (avg makespan) | TMSCRO (avg makespan) | DRSCRO (avg makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
16 | 82.356 | 80.668 | 80.235 | 80.074 | 78.433 | 82.015 | 1.706
32 | 75.630 | 74.080 | 73.682 | 73.535 | 72.027 | 75.317 | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

The value of CCR | HEFT (avg makespan) | DMSCRO (avg makespan) | TMSCRO (avg makespan) | DRSCRO (avg makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2 | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5 | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

The value of CCR | HEFT (avg makespan) | DMSCRO (avg makespan) | TMSCRO (avg makespan) | DRSCRO (avg makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 96.612 | 94.632 | 94.124 | 93.935 | 92.010 | 96.212 | 2.001
0.2 | 96.952 | 94.965 | 94.455 | 94.266 | 92.333 | 96.551 | 2.008
1 | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2 | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5 | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range and that metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments. The details of these experimental results are listed in Tables 9–11.

Figure 14 shows the experimental results of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has

better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases. It can be seen that the average makespan rises rapidly as the value of CCR increases, because the DAG becomes more communication-intensive with growing CCR, which leaves the processors in the idle state for longer.

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

The number of processors | HEFT (avg makespan) | DMSCRO (avg makespan) | TMSCRO (avg makespan) | DRSCRO (avg makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4 | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8 | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

The number of tasks | HEFT (avg makespan) | DMSCRO (avg makespan) | TMSCRO (avg makespan) | DRSCRO (avg makespan) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
10 | 714.484 | 706.149 | 699.100 | 690.708 | 688.982 | 693.638 | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for the random graphs under different CCRs; the task number is 50.

Value of CCR | 4 processors | 8 processors | 16 processors | 32 processors
0.1 | 145.093 | 116.068 | 116.058 | 114.592
0.2 | 145.261 | 116.200 | 116.163 | 114.700
1 | 153.248 | 120.228 | 119.511 | 118.534
2 | 194.401 | 166.014 | 164.246 | 162.292
5 | 508.759 | 498.427 | 492.049 | 487.830

Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of these three algorithms is that the total running time reaches a set value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random graph running instances.

Figure 15: Average makespan for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graphs under different CCRs; the task number is 50.

Figure 17: Convergence trace for the molecular dynamics code; CCR = 1.0 and the number of processors is 16.

Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also lets it obtain a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation experimental results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).
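The reported percentages are consistent with measuring the relative reduction in time-to-convergence against a reference algorithm. A small helper under that assumption (the paper reports only the averaged percentages, not the exact formula):

```python
def convergence_speedup(t_ref, t_new):
    """Percentage by which time-to-convergence shrinks relative to a
    reference algorithm. This formula is an assumption for illustration;
    the paper does not spell out how its percentages were computed."""
    return (t_ref - t_new) / t_ref * 100.0
```

For instance, reaching the final makespan in 8.06 s where the reference needs 10 s would be a 19.4% speedup under this definition.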

Moreover, the statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests (α = 0.05).

Method          Algorithms                      p value     Hypothesis
Friedman test   DRSCRO vs. DMSCRO               2.53E−02    Rejected
Friedman test   DRSCRO vs. TMSCRO               2.53E−02    Rejected
Friedman test   DRSCRO vs. DMSCRO vs. TMSCRO    6.70E−03    Rejected

Table 13: Results of Quade tests (α = 0.05).

Method          Algorithms                      p value     Hypothesis
Quade test      DRSCRO vs. DMSCRO               1.32E−02    Rejected
Quade test      DRSCRO vs. TMSCRO               1.32E−02    Rejected
Quade test      DRSCRO vs. DMSCRO vs. TMSCRO    1.09E−03    Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence results obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
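As a minimal illustration of the Friedman procedure applied here, the statistic can be computed from the per-test-case ranks of the k algorithms. This is a plain-Python sketch (no tie correction), not the authors' tooling, and the p-value helper covers only the df = 2 case, which corresponds to comparing three algorithms.

```python
import math

def friedman_statistic(blocks):
    """Friedman chi-square statistic (without tie correction) over n
    test cases ("blocks"), each holding the k algorithms' measured
    values (e.g., average makespans; lower is better)."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for block in blocks:
        order = sorted(range(k), key=lambda j: block[j])
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            while j + 1 < k and block[order[j + 1]] == block[order[i]]:
                j += 1  # extend over a group of tied values
            avg_rank = (i + j) / 2 + 1  # mean 1-based rank of the tie group
            for t in range(i, j + 1):
                ranks[order[t]] = avg_rank
            i = j + 1
        for idx in range(k):
            rank_sums[idx] += ranks[idx]
    mean_ranks = [s / n for s in rank_sums]
    chi2 = 12 * n / (k * (k + 1)) * sum(
        (r - (k + 1) / 2) ** 2 for r in mean_ranks)
    return chi2, mean_ranks

def chi2_sf_df2(x):
    """Survival function of the chi-square distribution with 2 degrees
    of freedom (k = 3 algorithms gives df = k - 1 = 2)."""
    return math.exp(-x / 2)
```

If an algorithm is ranked first on every test case, the mean ranks separate maximally and the resulting p-value falls well below 0.05, rejecting the null hypothesis of equivalent performance.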

Tables 12 and 13 list the test results of the Friedman test and the Quade test, respectively, both of which reject the null hypothesis of equivalent performance. In both tests, our proposed DRSCRO serves as the control method and is compared not only against all the algorithms together but also against each of the remaining ones pairwise. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms on convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind, because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions when averaged over all possible fitness functions. However, the proposed DRSCRO can find good solutions faster than the other algorithms, as the experimental results of the convergence tests show. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO obtains a better super molecule by a metaheuristic method and, under consideration of optimizing both the scheduling order and the processor assignment, takes advantage of the VNS algorithm in the global optimization phase to improve its optimization capability. A load-balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability, and the new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.
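A load-balance neighborhood move of the kind mentioned above can be sketched as shifting one task from the busiest processor to the least busy one. This simplified version ignores precedence constraints, communication costs, and the paper's processor selection model; all names are illustrative.

```python
def load_balance_move(assignment, exec_cost, processors):
    """One load-balance neighborhood move: shift a task from the busiest
    processor to the least busy one. A processor's load is the sum of
    the execution costs of its assigned tasks."""
    loads = {p: 0.0 for p in processors}
    for task, proc in assignment.items():
        loads[proc] += exec_cost[task][proc]
    busiest = max(loads, key=loads.get)
    idlest = min(loads, key=loads.get)
    if busiest == idlest:
        return dict(assignment)  # nothing to rebalance
    # move the busiest processor's task that is cheapest on the idle one
    candidates = [t for t, p in assignment.items() if p == busiest]
    moved = min(candidates, key=lambda t: exec_cost[t][idlest])
    neighbor = dict(assignment)
    neighbor[moved] = idlest
    return neighbor
```

Applying such a move inside the ineffective reaction operator lets a stalled molecule escape configurations where one processor dominates the schedule length.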

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load-balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and DRSCRO can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to address two main objectives: (1) minimization of schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.
[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.
[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.
[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.
[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.
[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.
[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.
[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.
[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.
[10] G. C. Sih and E. A. Lee, "Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.
[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.
[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.
[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.
[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.
[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.
[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.
[17] E. S. H. Hou, N. Ansari, and H. Ren, "Genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.
[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.
[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.
[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.
[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.
[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.
[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.
[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.
[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.
[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.
[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.
[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.
[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.
[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.
[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.
[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.
[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.
[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.
[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.
[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.
[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.
[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.
[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.
[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.




among DRSCRO TMSCRO and DMSCRO The conver-gence traces and significant tests are to further reveal thedifferences between DRSCRO and the other two algorithmsIn these experiments as suggested in [27] the stoppingcriteria of these three algorithms are that the total runningtime reaches a setting value (eg 180 s) Under the consid-eration of comparability the beginning of the time countingof DRSCRO is set as the start of the global optimizationphase processing In the first extensive test bed themakespanis averaged over 10 independent runs while in the secondextensive test bed the makespan is averaged over 30 differentrandom graph running instances

10 20 500

200400600800

10001200140016001800

The number of tasks

Mak

espa

n

HEFTDMSCRO

TMSCRODRSCRO

Figure 15 Average makespan for random graphs under differenttask numbers processors number is 32 and CCR = 10

541 Convergence Trace The convergence traces ofDRSCRO TMSCRO and DMSCRO for processing themolecular dynamics code and Gaussian elimination areplotted in Figures 17 and 18 respectively Figures 19ndash21show the convergence traces when processing the randomlygenerated DAG sets of which each contains 10 20 and50 tasks respectively As shown in Figures 17ndash21 it canbe observed that the convergence traces of these threealgorithms have obvious differences And the DRSCROconverges faster than the other two algorithms in every caseThe reason for the better rate of convergence of DRSCRO is as

Mathematical Problems in Engineering 17

Figure 16: Average makespan of DRSCRO for random graphs under different CCRs (task number is 50). One curve per processor count (|P| = 4, 8, 16, and 32) over the CCR values 0.1, 0.2, 1, 2, and 5.

Figure 17: Convergence trace for the molecular dynamics code (CCR = 1, number of processors is 16): makespan (×10^4) versus running time for DMSCRO, TMSCRO, and DRSCRO.

Figure 18: Convergence trace for Gaussian elimination (CCR = 0.2, number of processors is 8): makespan (×10^4) versus running time for DMSCRO, TMSCRO, and DRSCRO.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also lets it achieve a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks: makespan (×10^4) versus running time for DMSCRO, TMSCRO, and DRSCRO.

Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks: makespan (×10^4) versus running time for DMSCRO, TMSCRO, and DRSCRO.

Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks: makespan (×10^5) versus running time for DMSCRO, TMSCRO, and DRSCRO.

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).
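The reported convergence-speed percentages are presumably relative reductions in time-to-converge. A minimal sketch of that arithmetic, where the 62 s and 50 s figures are invented purely for illustration:

```python
def percent_faster(t_baseline, t_new):
    """Relative convergence-speed gain: the fraction of the baseline
    algorithm's time-to-converge that the new algorithm saves."""
    return 100.0 * (t_baseline - t_new) / t_baseline

# If a baseline needed 62 s to reach its final makespan and DRSCRO 50 s:
gain = percent_faster(62.0, 50.0)  # about 19.4
```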

Moreover, the statistical analysis based on the average values achieved is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests (α = 0.05).

Method          Algorithms                     p value     Hypothesis
Friedman test   DRSCRO vs. DMSCRO              2.53E-02    Rejected
Friedman test   DRSCRO vs. TMSCRO              2.53E-02    Rejected
Friedman test   DRSCRO vs. DMSCRO vs. TMSCRO   6.70E-03    Rejected

Table 13: Results of Quade tests (α = 0.05).

Method       Algorithms                     p value     Hypothesis
Quade test   DRSCRO vs. DMSCRO              1.32E-02    Rejected
Quade test   DRSCRO vs. TMSCRO              1.32E-02    Rejected
Quade test   DRSCRO vs. DMSCRO vs. TMSCRO   1.09E-03    Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis of the average convergence results obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, is necessary in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
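For context, the Friedman test ranks the algorithms within each problem instance and aggregates the ranks into a chi-square statistic. A self-contained sketch follows; the makespan values are made up for illustration, and ties within a block are not handled:

```python
import math

def friedman_statistic(blocks):
    """blocks: list of per-instance result tuples, one value per algorithm
    (lower is better). Returns the Friedman chi-square statistic,
    assuming no ties within a block."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for block in blocks:
        # Rank 1 goes to the best (smallest) value in this block.
        order = sorted(range(k), key=lambda j: block[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    chi2 = (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums)
    return chi2 - 3.0 * n * (k + 1)

# Hypothetical average makespans for (DRSCRO, TMSCRO, DMSCRO) on 4 instances:
data = [
    (1167.6, 1169.9, 1176.2),
    (1157.8, 1160.1, 1166.3),
    (1152.5, 1154.8, 1161.0),
    (1145.3, 1148.6, 1154.8),
]
stat = friedman_statistic(data)   # 8.0: DRSCRO ranks first in every block
p_value = math.exp(-stat / 2.0)   # chi-square survival function, df = k - 1 = 2
# p ≈ 0.018 < 0.05, so equal performance would be rejected here too
```

The closed-form p value works because the survival function of a chi-square distribution with 2 degrees of freedom is exp(-x/2).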

Tables 12 and 13, respectively, list the results of the Friedman test and the Quade test, which both reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is not only compared against all the algorithms together but also compared pairwise against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, which is the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind, because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance for searching optimal solutions when averaged over all possible fitness functions. However, as the convergence tests show, the proposed DRSCRO can achieve better performance and find good solutions faster than the other metaheuristic algorithms of the same kind. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, takes advantage of the VNS algorithm in the global optimization phase to improve its optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability. The new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.
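As a rough illustration of the VNS scheme discussed above, the skeleton below alternates between a load-balance-style move and a wider random move, restarting from the first neighborhood whenever an improvement is found. This is a simplified sketch, not the authors' code: it schedules independent tasks (no DAG precedence or communication costs), and the neighborhoods and the toy makespan objective are assumptions for illustration only:

```python
import random

def makespan(assign, costs, n_procs):
    """Makespan of an independent-task assignment: the heaviest processor load."""
    loads = [0.0] * n_procs
    for task, proc in enumerate(assign):
        loads[proc] += costs[task]
    return max(loads)

def move_task(assign, costs, n_procs, rng):
    """Load-balance move: shift one task from the most loaded processor
    to the least loaded one."""
    loads = [0.0] * n_procs
    for task, proc in enumerate(assign):
        loads[proc] += costs[task]
    src = loads.index(max(loads))
    dst = loads.index(min(loads))
    new = list(assign)
    new[rng.choice([t for t, p in enumerate(assign) if p == src])] = dst
    return new

def random_move(assign, costs, n_procs, rng):
    """Wider neighborhood: reassign one random task to a random processor."""
    new = list(assign)
    new[rng.randrange(len(new))] = rng.randrange(n_procs)
    return new

def vns(assign, costs, n_procs, iters=200, seed=1):
    """Basic variable-neighborhood-descent loop: try neighborhoods in order,
    and restart from the first one whenever an improving move is found."""
    rng = random.Random(seed)
    neighborhoods = [move_task, random_move]
    for _ in range(iters):
        k = 0
        while k < len(neighborhoods):
            cand = neighborhoods[k](assign, costs, n_procs, rng)
            if makespan(cand, costs, n_procs) < makespan(assign, costs, n_procs):
                assign, k = cand, 0
            else:
                k += 1
    return assign

costs = [5.0, 3.0, 8.0, 2.0, 7.0, 4.0]
start = [0, 0, 0, 0, 0, 0]           # everything on processor 0: makespan 29
best = vns(start, costs, n_procs=3)  # the makespan drops well below 29
```

The full DRSCRO additionally embeds such neighborhoods in the CRO reaction operators and uses the new processor selection model when evaluating candidate assignments.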

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed for DAG scheduling on heterogeneous systems in this paper. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO can achieve a higher speedup than the other CRO-based algorithms as far as we know, and DRSCRO can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to address two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "Genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.


[32] N. Mladenović and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenović, and J. M. Pérez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menascé, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.





In future work we will analyze the parameter sensitivityof DRSCRO for promoting its activeness Moreover to makethe proposed algorithmmore practical DRSCROwill be alsoextended to aim at twomain objectives such as (1) minimiza-tion of schedule length (time domain) and (2) minimizationof number of used processors (resource domain)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is financially supported by the National NaturalScience Foundation of China (Grant no 61462073) and

Mathematical Problems in Engineering 19

the National High Technology Research and DevelopmentProgramof China (863 Program) (Grant no 2015AA020107)




[Figure 10: Average makespan for the molecular dynamics code, CCR = 1.0. Series: HEFT, DMSCRO, TMSCRO, DRSCRO; x-axis: |P| = 4, 8, 16, 32; y-axis: makespan.]

[Figure 11: Average makespan for Gaussian elimination, CCR = 0.2. Series: HEFT, DMSCRO, TMSCRO, DRSCRO; x-axis: |P| = 4, 8, 16, 32; y-axis: makespan.]

of DRSCRO, the best final value achieved in all these runs, the worst final value, and the related standard deviation or variance are also presented.

5.3.1. Real-World Application Graphs. Figures 10–13 show the simulation experiment results of DRSCRO, DMSCRO, TMSCRO, and HEFT on the real-world application graphs, and Tables 4–7 list the details of the experimental results.

As shown in Figures 10 and 11, the average makespan decreases as the processor number increases. The results also show that DRSCRO, TMSCRO, and DMSCRO, which are all metaheuristic methods, achieve very similar performance. This is because, according to the No-Free-Lunch Theorem, all well-designed metaheuristic methods have the same performance on searching for optimal solutions when averaged over all possible fitness functions. The TMSCRO and DMSCRO implementations used in the simulation are well designed and taken from the literature; this similarity therefore indicates that the DRSCRO developed in our work is also well designed.

A close observation of the results in Tables 5 and 6 shows that DRSCRO slightly outperforms TMSCRO and DMSCRO on average. The reason is that DRSCRO has a better intensification capability, obtained by applying VNS

[Figure 12: Average makespan for the molecular dynamics code; the number of processors is 16. Series: HEFT, DMSCRO, TMSCRO, DRSCRO; x-axis: CCR = 0.1, 0.2, 1, 2, 5; y-axis: makespan.]

[Figure 13: Average makespan for Gaussian elimination; the number of processors is 8. Series: HEFT, DMSCRO, TMSCRO, DRSCRO; x-axis: CCR = 0.1, 0.2, 1, 2, 5; y-axis: makespan.]

and by utilizing one of its neighborhood structures in the ineffective reaction operator, as presented in the last paragraph of Section 4.6. Therefore, the average results obtained by DRSCRO are better than those obtained by TMSCRO and DMSCRO when the stopping criterion is satisfied. Moreover, DRSCRO, TMSCRO, and DMSCRO typically outperform HEFT because, as metaheuristic methods, they search a wider area of the solution space, while the search of HEFT is narrowed down to a much smaller portion by means of its heuristics.
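This narrowing can be made concrete with a minimal sketch of EFT-based list scheduling in the spirit of HEFT (a simplified, non-insertion variant with illustrative cost data; it is not the implementation benchmarked here): the schedule is fixed by a single greedy pass, with no search over alternative task orders or processor assignments.

```python
def list_schedule(comp, comm, succ, n_procs):
    """Greedy EFT-based list scheduling: rank tasks by average upward rank,
    then map each task to the processor giving the earliest finish time.
    comp[t][p] is the cost of task t on processor p; comm[(t, s)] is the
    communication cost on edge t -> s (charged only across processors)."""
    n = len(comp)
    avg = [sum(row) / n_procs for row in comp]
    # upward rank: average cost of the longest path from the task to an exit
    rank = [0.0] * n
    for t in reversed(range(n)):  # tasks are assumed topologically indexed
        rank[t] = avg[t] + max((comm[(t, s)] + rank[s] for s in succ[t]), default=0.0)
    order = sorted(range(n), key=lambda t: -rank[t])  # decreasing priority
    pred = {t: [] for t in range(n)}
    for t in range(n):
        for s in succ[t]:
            pred[s].append(t)
    proc_free = [0.0] * n_procs   # earliest free time of each processor
    finish = {}                   # task -> (finish time, processor)
    for t in order:
        best = None
        for p in range(n_procs):
            # data from a predecessor on another processor incurs communication
            ready = max((finish[u][0] + (0.0 if finish[u][1] == p else comm[(u, t)])
                         for u in pred[t]), default=0.0)
            eft = max(ready, proc_free[p]) + comp[t][p]
            if best is None or eft < best[0]:
                best = (eft, p)
        finish[t] = best
        proc_free[best[1]] = best[0]
    return max(f for f, _ in finish.values())  # makespan

# tiny fork graph: task 0 feeds tasks 1 and 2, two heterogeneous processors
comp = [[2, 3], [3, 2], [4, 4]]
succ = {0: [1, 2], 1: [], 2: []}
comm = {(0, 1): 1, (0, 2): 10}
makespan = list_schedule(comp, comm, succ, 2)
```

In this toy instance the greedy pass keeps the expensive edge (0, 2) on one processor and moves only task 1, illustrating how a single heuristic decision path replaces the wider search performed by the metaheuristics.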

Figures 12 and 13 and the results in Tables 7 and 8 show the performance of these four algorithms as the CCR value increases. The average makespan increases with the CCR value because the heterogeneous processors stay in the idle state for longer as the DAGs become more communication-intensive. It can also be observed that DRSCRO, TMSCRO, and DMSCRO outperform HEFT and that the advantage becomes more significant as the value of CCR increases, which suggests that a heuristic algorithm like HEFT has less consistent performance across a wide scheduling


Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 120.086 | 117.624 | 116.993 | 116.759 | 114.365 | 119.589 | 2.487
8  | 120.074 | 116.633 | 116.007 | 115.775 | 113.402 | 118.582 | 2.466
16 | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
32 | 119.993 | 115.476 | 114.856 | 114.529 | 112.182 | 117.306 | 2.440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 126.852 | 124.252 | 123.585 | 123.337 | 122.042 | 126.327 | 1.769
8  |  96.952 |  94.965 |  94.455 |  94.266 |  92.333 |  96.551 | 2.008
16 |  82.356 |  80.668 |  80.235 |  80.074 |  78.433 |  82.015 | 1.706
32 |  75.630 |  74.080 |  73.682 |  73.535 |  72.027 |  75.317 | 1.566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

CCR | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 111.782 | 109.491 | 108.903 | 108.685 | 106.457 | 111.320 | 2.315
0.2 | 112.034 | 109.737 | 109.148 | 108.930 | 106.697 | 111.571 | 2.320
1   | 120.031 | 116.101 | 115.478 | 115.247 | 112.885 | 118.041 | 2.455
2   | 160.760 | 157.465 | 156.619 | 156.306 | 153.102 | 158.532 | 3.034
5   | 414.581 | 406.082 | 403.902 | 403.095 | 400.878 | 404.805 | 2.125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

CCR | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 |  96.612 |  94.632 |  94.124 |  93.935 |  92.010 |  96.212 | 2.001
0.2 |  96.952 |  94.965 |  94.455 |  94.266 |  92.333 |  96.551 | 2.008
1   | 131.094 | 128.407 | 127.717 | 127.462 | 124.849 | 130.552 | 2.715
2   | 222.635 | 218.071 | 216.900 | 216.467 | 214.194 | 219.550 | 2.456
5   | 403.600 | 395.327 | 393.204 | 392.418 | 390.260 | 394.083 | 2.069

scenario range, while metaheuristic algorithms perform more effectively on communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments, and the details of the experimental results are listed in Tables 9–11.

Figure 14 shows the performance of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has

better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows the average makespan as the CCR value increases: the average makespan rises rapidly because the DAG becomes more communication-intensive with increasing CCR, which leaves the processors in the idle state for longer.
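As a sketch of how a random test bed can hit a target CCR (a hypothetical generator, not necessarily the one used for these experiments), the edge weights of a random DAG can be rescaled so that the average communication cost divided by the average computation cost equals the desired value:

```python
import random

def make_random_dag(n_tasks, n_procs, ccr, edge_prob=0.3, seed=0):
    """Generate a random DAG and scale edge weights to a target CCR, where
    CCR = (average communication cost per edge) /
          (average computation cost per task, averaged over processors)."""
    rng = random.Random(seed)
    # computation cost of each task on each heterogeneous processor
    comp = [[rng.uniform(10, 50) for _ in range(n_procs)] for _ in range(n_tasks)]
    # edges only from lower to higher task index, so the graph is acyclic
    edges = {(i, j): rng.uniform(10, 50)
             for i in range(n_tasks) for j in range(i + 1, n_tasks)
             if rng.random() < edge_prob}
    avg_comp = sum(sum(row) / n_procs for row in comp) / n_tasks
    avg_comm = sum(edges.values()) / len(edges)
    scale = ccr * avg_comp / avg_comm          # rescale edges to the target CCR
    edges = {e: w * scale for e, w in edges.items()}
    return comp, edges

comp, edges = make_random_dag(50, 8, ccr=2.0)
avg_comp = sum(sum(r) / len(r) for r in comp) / len(comp)
avg_comm = sum(edges.values()) / len(edges)
```

After rescaling, avg_comm / avg_comp equals the requested CCR of 2.0, so a DAG set for any point on the CCR axis of Figure 16 can be produced the same way.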

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan


Table 9: Experiment results for random graphs under different processor numbers; the task number is 50 and CCR = 0.2.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 149.400 | 146.337 | 145.552 | 145.261 | 142.283 | 148.782 | 3.094
8  | 119.511 | 117.061 | 116.433 | 116.200 | 113.818 | 119.017 | 2.475
16 | 119.473 | 117.024 | 116.396 | 116.163 | 113.782 | 118.979 | 2.474
32 | 119.468 | 115.550 | 114.929 | 114.700 | 112.348 | 117.480 | 2.443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Tasks | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
10 |  714.484 |  706.149 |  699.100 |  690.708 |  688.982 |  693.638 | 2.025
20 | 1100.918 | 1089.685 | 1080.428 | 1069.082 | 1066.410 | 1071.479 | 2.619
50 | 1666.373 | 1662.543 | 1649.988 | 1634.230 | 1633.415 | 1636.261 | 1.165

Table 11: Experiment results of DRSCRO for the random graphs under different CCRs; the task number is 50.

CCR | |P| = 4 | |P| = 8 | |P| = 16 | |P| = 32
0.1 | 145.093 | 116.068 | 116.058 | 114.592
0.2 | 145.261 | 116.200 | 116.163 | 114.700
1   | 153.248 | 120.228 | 119.511 | 118.534
2   | 194.401 | 166.014 | 164.246 | 162.292
5   | 508.759 | 498.427 | 492.049 | 487.830

[Figure 14: Average makespan for random graphs under different processor numbers; the task number is 50 and CCR = 0.2. Series: HEFT, DMSCRO, TMSCRO, DRSCRO; x-axis: |P| = 4, 8, 16, 32; y-axis: makespan.]

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of all three algorithms is that the total running time reaches a preset value (e.g., 180 s). For comparability, the time counting of DRSCRO starts at the beginning of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, it is averaged over 30 different random graph instances.
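The running-time stopping criterion and the recording of a convergence trace can be sketched as a generic harness (the search step below is a toy scalar perturbation standing in for one optimizer iteration, and the budget is shortened from 180 s for illustration):

```python
import random
import time

def run_with_time_budget(step, initial, budget_s=0.05, seed=1):
    """Run an iterative optimizer until a wall-clock budget expires,
    recording (elapsed seconds, best makespan so far) as a convergence trace."""
    rng = random.Random(seed)
    best = initial
    trace = [(0.0, best)]
    start = time.perf_counter()
    while True:
        elapsed = time.perf_counter() - start
        if elapsed >= budget_s:          # stopping criterion: total running time
            break
        cand = step(best, rng)
        if cand < best:                  # minimizing makespan
            best = cand
            trace.append((elapsed, best))
    return best, trace

# toy "search step": random perturbation of a scalar stand-in for a schedule
best, trace = run_with_time_budget(lambda b, r: b + r.uniform(-1.0, 0.5), 100.0)
```

Plotting each algorithm's trace of best-so-far makespan against elapsed time yields curves of the kind shown in Figures 17–21.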

[Figure 15: Average makespan for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0. Series: HEFT, DMSCRO, TMSCRO, DRSCRO; x-axis: number of tasks = 10, 20, 50; y-axis: makespan.]

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets containing 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms differ clearly, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better rate of convergence of DRSCRO is as


[Figure 16: Average makespan of DRSCRO for the random graphs under different CCRs; the task number is 50. Series: |P| = 4, 8, 16, 32; x-axis: CCR = 0.1, 0.2, 1, 2, 5; y-axis: makespan.]

[Figure 17: Convergence trace for the molecular dynamics code; CCR = 1 and the number of processors is 16. Series: DMSCRO, TMSCRO, DRSCRO; x-axis: time (ms); y-axis: makespan (×10^4).]

[Figure 18: Convergence trace for Gaussian elimination; CCR = 0.2 and the number of processors is 8. Series: DMSCRO, TMSCRO, DRSCRO; x-axis: time (ms); y-axis: makespan (×10^4).]

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO still gives it a better convergence rate than TMSCRO

[Figure 19: Convergence trace for the set of randomly generated DAGs with 10 tasks. Series: DMSCRO, TMSCRO, DRSCRO; x-axis: time (ms); y-axis: makespan (×10^4).]

[Figure 20: Convergence trace for the set of randomly generated DAGs with 20 tasks. Series: DMSCRO, TMSCRO, DRSCRO; x-axis: time (ms); y-axis: makespan (×10^4).]

[Figure 21: Convergence trace for the set of randomly generated DAGs with 50 tasks. Series: DMSCRO, TMSCRO, DRSCRO; x-axis: time (ms); y-axis: makespan (×10^5).]

and DMSCRO. The simulation experimental results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, the statistical analysis based on the achieved average values is also presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests (α = 0.05).

Method | Algorithms compared | p value | Hypothesis
Friedman test | DRSCRO vs. DMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO vs. TMSCRO | 2.53E-02 | Rejected
Friedman test | DRSCRO vs. DMSCRO vs. TMSCRO | 6.70E-03 | Rejected

Table 13: Results of Quade tests (α = 0.05).

Method | Algorithms compared | p value | Hypothesis
Quade test | DRSCRO vs. DMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO vs. TMSCRO | 1.32E-02 | Rejected
Quade test | DRSCRO vs. DMSCRO vs. TMSCRO | 1.09E-03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis of the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, is necessary in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level of α = 0.05 is used in all statistical tests.

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively; both reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is not only compared against all the algorithms simultaneously but also compared pairwise against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences (at level α = 0.05) in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.
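For reference, the Friedman statistic behind results of this kind can be computed from per-instance average results as follows (a plain-Python sketch; the result matrix below is a hypothetical placeholder, not the paper's data). With k = 3 algorithms the statistic is compared against the chi-square critical value 5.991 (df = 2, α = 0.05):

```python
def friedman_statistic(results):
    """results[i][j]: average makespan of algorithm j on problem instance i.
    Returns the Friedman chi-square statistic for k algorithms on n instances
    (ties are ignored in this sketch)."""
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])
        for r, j in enumerate(order, start=1):
            rank_sums[j] += r            # rank 1 = best (smallest makespan)
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)

# hypothetical per-instance averages for three algorithms (columns)
data = [[116.2, 116.4, 117.1], [94.3, 94.5, 95.0], [80.1, 80.2, 80.7],
        [73.5, 73.7, 74.1], [145.3, 145.6, 146.3], [116.2, 116.4, 117.1]]
chi2 = friedman_statistic(data)
reject = chi2 > 5.991                    # df = k - 1 = 2, alpha = 0.05
```

Because the first column ranks best on every instance, the statistic is well above the critical value, so the null hypothesis of equivalent performance would be rejected for this toy data.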

6 Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind because, according to the No-Free-Lunch Theorem, every well-designed metaheuristic algorithm has the same performance for searching optimal solutions when averaged over all possible fitness functions. However, as the experimental results of the convergence tests show, the proposed DRSCRO can achieve better performance and find good solutions faster than the other algorithms of this kind. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, takes advantage of

the VNS algorithm in the global optimization phase to improve its optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for better intensification capability, and the new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.
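The idea of a load-balance move can be sketched as follows (an illustrative reading of such a neighborhood structure, not the paper's exact operator): shift one task from the most-loaded processor to the least-loaded one while keeping the scheduling order fixed.

```python
def load_balance_move(assign, comp):
    """One load-balance neighborhood move: migrate the task from the
    most-loaded processor to the least-loaded one that best evens out
    their loads. assign[t] is the processor of task t; comp[t][p] is the
    computation cost of task t on processor p (processors 0..max(assign))."""
    n_procs = max(assign) + 1
    load = [0.0] * n_procs
    for t, p in enumerate(assign):
        load[p] += comp[t][p]
    src = max(range(n_procs), key=lambda p: load[p])
    dst = min(range(n_procs), key=lambda p: load[p])
    movable = [t for t, p in enumerate(assign) if p == src]
    # pick the task whose migration leaves the two loads closest together
    best_t = min(movable, key=lambda t:
                 abs((load[src] - comp[t][src]) - (load[dst] + comp[t][dst])))
    new_assign = list(assign)
    new_assign[best_t] = dst
    return new_assign

assign = [0, 0, 0, 1]                    # current task -> processor mapping
comp = [[5, 5], [1, 1], [1, 1], [2, 2]]  # comp[t][p]: cost of task t on processor p
neighbor = load_balance_move(assign, comp)
```

A VNS-style search would evaluate the makespan of such neighbors and keep the move only when it improves the incumbent schedule.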

7 Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed for DAG scheduling on heterogeneous systems in this paper. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. The super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate, which distinguishes DRSCRO from other CRO-based algorithms for DAG scheduling on heterogeneous systems. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and it also obtains a better average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to improve its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to address two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S C Porto and C C Ribeiro ldquoA tabu search approach totask scheduling on heterogeneous processors under precedenceconstraintsrdquo International Journal of High Speed Computing vol7 no 1 pp 45ndash72 1995

[24] A V Kalashnikov and V A Kostenko ldquoA parallel algorithm ofsimulated annealing for multiprocessor schedulingrdquo Journal ofComputer and Systems Sciences International vol 47 no 3 pp455ndash463 2008

[25] A Y S Lam and V O K Li ldquoChemical-reaction-inspiredmeta-heuristic for optimizationrdquo IEEE Transactions on EvolutionaryComputation vol 14 no 3 pp 381ndash399 2010

[26] P Choudhury R Kumar and P P Chakrabarti ldquoHybridscheduling of dynamic task graphs with selective duplicationfor multiprocessors under memory and time constraintsrdquo IEEETransactions on Parallel and Distributed Systems vol 19 no 7pp 967ndash980 2008

[27] Y Xu K Li LHe andT K Truong ldquoADAG scheduling schemeon heterogeneous computing systems using double molecularstructure-based chemical reaction optimizationrdquo Journal ofParallel andDistributed Computing vol 73 no 9 pp 1306ndash13222013

[28] J Xu and A Lam ldquoChemical reaction optimization for the gridscheduling problemrdquo in Proceedings of the IEEE InternationalConference on Communications pp 1ndash5 IEEE Computer Soci-ety Cape Town South Africa May 2010

[29] B Varghese G McKee and V Alexandrov ldquoCan agent intel-ligence be used to achieve fault tolerant parallel computingsystemsrdquo Parallel Processing Letters vol 21 no 4 pp 379ndash3962011

[30] Y Jiang Z Shao and Y Guo ldquoA DAG scheduling scheme onheterogeneous computing systems using tuple-based chemicalreaction optimizationrdquo Scientific World Journal vol 2014 Arti-cle ID 404375 23 pages 2014

[31] J Xu AY S Lam and V O K Li ldquoStock portfolio selectionusing chemical reaction optimizationrdquo in Proceedings of theInternational Conference on Operations Research and FinancialEngineering pp 458ndash463 WASET Paris France June 2011

20 Mathematical Problems in Engineering

[32] N Mladenovic and P Hansen ldquoVariable neighborhood searchrdquoComputers amp Operations Research vol 24 no 11 pp 1097ndash11001997

[33] K Li X Tang and K Li ldquoEnergy-efficient stochastic taskscheduling on heterogeneous computing systemsrdquo IEEE Trans-actions on Parallel and Distributed Systems vol 25 no 11 pp2867ndash2876 2014

[34] D HWolpert andWGMacready ldquoNo free lunch theorems foroptimizationrdquo IEEE Transactions on Evolutionary Computationvol 1 no 1 pp 67ndash82 1997

[35] M A Khan ldquoScheduling for heterogeneous Systems usingconstrained critical pathsrdquo Parallel Computing vol 38 no 4-5pp 175ndash193 2012

[36] P Hansen N Mladenovic and J M Perez ldquoVariable neigh-borhood search methods and applicationsrdquo 4OR-A QuarterlyJournal of Operations Research vol 6 no 4 pp 319ndash360 2008

[37] F Glover and G Kochenberger Handbook of MetaheuristicsSpringer New York NY USA 2003

[38] S Kim and J Browne ldquoA general approach to mapping of par-allel computation upon multiprocessor architecturesrdquo in Pro-ceedings of the International Conference on Parallel Processingpp 1ndash8 Pennsylvania State University Pennsylvania State Uni-versity Press University Park Pa USA August 1988

[39] V A F Almeida I M M Vasconcelos J N C Arabe and D AMenasce ldquoUsing random task graphs to investigate the potentialbenefits of heterogeneity in parallel systemsrdquo in Proceedingsof the ACMIEEE Conference on Supercomputing pp 683ndash691IEEE Computer Society Minneapolis Minn USA November1992

[40] S Garcıa A Fernandez J Luengo and F Herrera ldquoAdvancednonparametric tests for multiple comparisons in the design ofexperiments in computational intelligence and data miningexperimental analysis of powerrdquo Information Sciences vol 180no 10 pp 2044ndash2064 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 15: Research Article DRSCRO: A Metaheuristic Algorithm for ...downloads.hindawi.com/journals/mpe/2015/396582.pdf · Research Article DRSCRO: A Metaheuristic Algorithm for Task Scheduling

Mathematical Problems in Engineering 15

Table 5: Experiment results for the molecular dynamics code, CCR = 1.0.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 120086 | 117624 | 116993 | 116759 | 114365 | 119589 | 2487
8  | 120074 | 116633 | 116007 | 115775 | 113402 | 118582 | 2466
16 | 120031 | 116101 | 115478 | 115247 | 112885 | 118041 | 2455
32 | 119993 | 115476 | 114856 | 114529 | 112182 | 117306 | 2440

Table 6: Experiment results for the Gaussian elimination graph, CCR = 0.2.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 126852 | 124252 | 123585 | 123337 | 122042 | 126327 | 1769
8  | 96952  | 94965  | 94455  | 94266  | 92333  | 96551  | 2008
16 | 82356  | 80668  | 80235  | 80074  | 78433  | 82015  | 1706
32 | 75630  | 74080  | 73682  | 73535  | 72027  | 75317  | 1566

Table 7: Experiment results for the molecular dynamics code; the number of processors is 16.

CCR | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 111782 | 109491 | 108903 | 108685 | 106457 | 111320 | 2315
0.2 | 112034 | 109737 | 109148 | 108930 | 106697 | 111571 | 2320
1   | 120031 | 116101 | 115478 | 115247 | 112885 | 118041 | 2455
2   | 160760 | 157465 | 156619 | 156306 | 153102 | 158532 | 3034
5   | 414581 | 406082 | 403902 | 403095 | 400878 | 404805 | 2125

Table 8: Experiment results for the Gaussian elimination graph; the number of processors is 8.

CCR | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
0.1 | 96612  | 94632  | 94124  | 93935  | 92010  | 96212  | 2001
0.2 | 96952  | 94965  | 94455  | 94266  | 92333  | 96551  | 2008
1   | 131094 | 128407 | 127717 | 127462 | 124849 | 130552 | 2715
2   | 222635 | 218071 | 216900 | 216467 | 214194 | 219550 | 2456
5   | 403600 | 395327 | 393204 | 392418 | 390260 | 394083 | 2069

scenario range, and the metaheuristic algorithms perform more effectively for communication-intensive DAGs.

5.3.2. Randomly Generated Application Graphs. As shown in Figures 14–16, randomly generated DAGs are used to evaluate the performance of DRSCRO, TMSCRO, DMSCRO, and HEFT in these experiments. The details of these experimental results are listed in Tables 9–11.

Figure 14 shows the experimental results of these four algorithms as the processor number increases. As shown in Figure 14, DRSCRO always outperforms TMSCRO, DMSCRO, and HEFT as the number of processors increases. Figure 15 shows that DRSCRO has better performance than the other three algorithms as the task number increases. The reasons for these results are similar to those explained in the third paragraph of Section 5.3.1. Figure 16 shows that the average makespan increases rapidly as the value of CCR rises. This is because the DAG becomes more communication-intensive as CCR increases, which leaves the processors in the idle state for longer.
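For readers tracing these trends, the CCR of a DAG is conventionally the average communication (edge) cost divided by the average computation (node) cost, so a higher CCR means a more communication-intensive graph. A minimal sketch on a hypothetical four-task graph (the helper name and costs are illustrative, not from the paper):

```python
# Computing the communication-to-computation ratio (CCR) of a DAG.
# The toy graph below is hypothetical and only illustrates the ratio.
def ccr(comp_costs, comm_costs):
    """CCR = mean edge (communication) cost / mean node (computation) cost."""
    avg_comm = sum(comm_costs.values()) / len(comm_costs)
    avg_comp = sum(comp_costs.values()) / len(comp_costs)
    return avg_comm / avg_comp

# Four tasks and three precedence edges.
comp = {"t1": 10.0, "t2": 20.0, "t3": 20.0, "t4": 10.0}           # mean 15
comm = {("t1", "t2"): 3.0, ("t1", "t3"): 3.0, ("t3", "t4"): 3.0}  # mean 3
print(ccr(comp, comm))  # 3 / 15 = 0.2, a computation-intensive DAG
```

A CCR well above 1 would mark the communication-intensive regime in which, per the results above, the processors spend more time idle.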

5.4. Convergence Tests. In this section, convergence experiments are conducted to show the change of makespan over time


Table 9: Experiment results for random graphs under different processor numbers; task number is 50 and CCR = 0.2.

Processors | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
4  | 149400 | 146337 | 145552 | 145261 | 142283 | 148782 | 3094
8  | 119511 | 117061 | 116433 | 116200 | 113818 | 119017 | 2475
16 | 119473 | 117024 | 116396 | 116163 | 113782 | 118979 | 2474
32 | 119468 | 115550 | 114929 | 114700 | 112348 | 117480 | 2443

Table 10: Experiment results for random graphs under different task numbers; the number of processors is 32 and CCR = 1.0.

Tasks | HEFT (avg) | DMSCRO (avg) | TMSCRO (avg) | DRSCRO (avg) | DRSCRO (best) | DRSCRO (worst) | DRSCRO (variance)
10 | 714484  | 706149  | 699100  | 690708  | 688982  | 693638  | 2025
20 | 1100918 | 1089685 | 1080428 | 1069082 | 1066410 | 1071479 | 2619
50 | 1666373 | 1662543 | 1649988 | 1634230 | 1633415 | 1636261 | 1165

Table 11: Experiment results of DRSCRO for the random graph under different CCRs; the task number is 50.

CCR | |P| = 4 | |P| = 8 | |P| = 16 | |P| = 32
0.1 | 145093 | 116068 | 116058 | 114592
0.2 | 145261 | 116200 | 116163 | 114700
1   | 153248 | 120228 | 119511 | 118534
2   | 194401 | 166014 | 164246 | 162292
5   | 508759 | 498427 | 492049 | 487830

Figure 14: Average makespan of HEFT, DMSCRO, TMSCRO, and DRSCRO for random graphs under different processor numbers (|P| = 4, 8, 16, 32); task number is 50 and CCR = 0.2.

The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion of the three algorithms is that the total running time reaches a preset value (e.g., 180 s). For comparability, the time counting of DRSCRO begins at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random-graph instances.
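The wall-clock stopping rule described above can be sketched as a simple driver loop; `improve`, `makespan`, and the numeric stand-in for a schedule are hypothetical placeholders for one iteration of any of the compared metaheuristics:

```python
import random
import time

def run_with_time_budget(improve, schedule, makespan, budget_s):
    """Iterate a search until the wall-clock budget expires, keeping the
    best makespan seen (the stopping criterion used in these tests)."""
    best = makespan(schedule)
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        schedule = improve(schedule)
        best = min(best, makespan(schedule))
    return best

makespan = lambda s: s                             # identity: lower is better
improve = lambda s: max(0.0, s - random.random())  # one hypothetical iteration
best = run_with_time_budget(improve, 100.0, makespan, budget_s=0.05)
print(best)
```

In the experiments the budget would be the 180 s suggested in [27], and the returned makespan would then be averaged over the 10 or 30 runs described above.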

Figure 15: Average makespan of HEFT, DMSCRO, TMSCRO, and DRSCRO for random graphs under different task numbers (10, 20, 50); the number of processors is 32 and CCR = 1.0.

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19–21 show the convergence traces when processing the randomly generated DAG sets, which contain 10, 20, and 50 tasks, respectively. As shown in Figures 17–21, the convergence traces of these three algorithms have obvious differences, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graph under different CCRs (curves for |P| = 4, 8, 16, 32); the task number is 50.

Figure 17: Convergence traces (makespan versus running time) of DMSCRO, TMSCRO, and DRSCRO for the molecular dynamics code; CCR = 1.0 and the number of processors is 16.

Figure 18: Convergence traces (makespan versus running time) of DMSCRO, TMSCRO, and DRSCRO for Gaussian elimination; CCR = 0.2 and the number of processors is 8.

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO still gives it a better convergence rate than TMSCRO

Figure 19: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for the set of randomly generated DAGs with 10 tasks.

Figure 20: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for the set of randomly generated DAGs with 20 tasks.

Figure 21: Convergence traces of DMSCRO, TMSCRO, and DRSCRO for the set of randomly generated DAGs with 50 tasks.

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).

Moreover, the statistical analysis based on the average values achieved is presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Friedman test | DRSCRO vs. DMSCRO | 2.53E−02 | Rejected
Friedman test | DRSCRO vs. TMSCRO | 2.53E−02 | Rejected
Friedman test | DRSCRO vs. DMSCRO vs. TMSCRO | 6.70E−03 | Rejected

Table 13: Results of Quade tests, α = 0.05.

Method | Algorithms | p value | Hypothesis
Quade test | DRSCRO vs. DMSCRO | 1.32E−02 | Rejected
Quade test | DRSCRO vs. TMSCRO | 1.32E−02 | Rejected
Quade test | DRSCRO vs. DMSCRO vs. TMSCRO | 1.09E−03 | Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis of the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, is necessary in order to find significant differences among these results. Nonparametric tests, per the recommendations in [40], are used because the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.
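As a concrete illustration of the Friedman test, its statistic can be computed by ranking the algorithms within each problem instance and combining the rank sums; the statistic follows a chi-square distribution with k − 1 degrees of freedom. The makespan values below are hypothetical (not the paper's data), and ties are not handled in this sketch:

```python
def friedman_statistic(results):
    """results[i][j] = makespan of algorithm j on instance i (lower is better)."""
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for row in results:
        # Rank 1 goes to the best (smallest) makespan on this instance.
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    # Standard Friedman chi-square statistic with k - 1 degrees of freedom.
    s = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * s - 3.0 * n * (k + 1)

# Three algorithms on four instances; algorithm 0 always wins.
data = [[10.1, 11.0, 11.5],
        [9.8, 10.4, 10.9],
        [20.3, 21.0, 21.2],
        [15.0, 15.8, 16.1]]
print(friedman_statistic(data))  # 8.0, giving p < 0.05 with 2 degrees of freedom
```

The Quade test proceeds similarly but additionally weights each instance by the range of its observed values, so instances that discriminate more strongly between algorithms count more.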

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively, both of which reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is compared not only against all the algorithms together but also against each of the remaining ones, with DRSCRO as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms on convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind, because, when averaged over all possible fitness functions, every well-designed metaheuristic algorithm has the same performance in searching for optimal solutions, according to the No-Free-Lunch Theorem [34]. However, as the convergence tests show, DRSCRO can achieve better performance and find good solutions faster than the other algorithms of this kind. As analyzed in the last paragraph of Section 4.6, the reasons are that DRSCRO creates a better super molecule by a metaheuristic method, under the consideration of optimizing both scheduling order and processor assignment, and that it takes advantage of the VNS algorithm in the global optimization phase to improve its optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for better intensification capability. The new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.
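The load balance neighborhood structure mentioned above can be read as a simple move: take a task off the most-loaded processor and reassign it to the least-loaded one. The sketch below is an illustrative interpretation of that idea, not the paper's exact operator, with hypothetical tasks, processors, and execution costs:

```python
# Sketch of a load-balance neighborhood move on a heterogeneous system.
# Assumes at least two processors; precedence constraints are ignored here.
def load_balance_move(assignment, exec_time):
    """assignment: task -> processor; exec_time[task][processor] = cost."""
    procs = {p for costs in exec_time.values() for p in costs}
    loads = {p: 0.0 for p in procs}
    for task, proc in assignment.items():
        loads[proc] += exec_time[task][proc]
    busiest = max(loads, key=loads.get)
    idlest = min(loads, key=loads.get)
    # Move the task that is cheapest to run on the idle processor.
    task = min((t for t, p in assignment.items() if p == busiest),
               key=lambda t: exec_time[t][idlest])
    neighbor = dict(assignment)  # leave the input assignment unchanged
    neighbor[task] = idlest
    return neighbor

# Two heterogeneous processors; all three tasks start piled onto p0.
et = {"t1": {"p0": 4.0, "p1": 5.0},
      "t2": {"p0": 6.0, "p1": 7.0},
      "t3": {"p0": 2.0, "p1": 3.0}}
print(load_balance_move({"t1": "p0", "t2": "p0", "t3": "p0"}, et))
# -> {'t1': 'p0', 't2': 'p0', 't3': 'p1'}
```

A VNS-style search would evaluate the makespan of such neighbors and keep a move only if it improves the incumbent schedule.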

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new model for processor selection utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and DRSCRO can also obtain better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to promote its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will be extended to target two main objectives: (1) minimization of schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "Genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 16: Research Article DRSCRO: A Metaheuristic Algorithm for ...downloads.hindawi.com/journals/mpe/2015/396582.pdf · Research Article DRSCRO: A Metaheuristic Algorithm for Task Scheduling

16 Mathematical Problems in Engineering

Table 9 Experiment results for random graphs under different processor numbers task number is 50 and CCR = 02

The number ofprocessors

HEFT(averagemakespan)

DMSCRO(averagemakespan)

TMSCRO(averagemakespan)

DRSCRO(averagemakespan)

DRSCRO(best

makespan)

DRSCRO(worst

makespan)

DRSCRO(variance)

4 149400 146337 145552 145261 142283 148782 30948 119511 117061 116433 116200 113818 119017 247516 119473 117024 116396 116163 113782 118979 247432 119468 115550 114929 114700 112348 117480 2443

Table 10: Experiment results for random graphs under different task numbers (processor number is 32 and CCR = 1.0). All makespan values are averages unless noted.

Tasks   HEFT        DMSCRO      TMSCRO      DRSCRO      DRSCRO      DRSCRO      DRSCRO
        (average)   (average)   (average)   (average)   (best)      (worst)     (variance)
10       714.484     706.149     699.100     690.708     688.982     693.638    2.025
20      1100.918    1089.685    1080.428    1069.082    1066.410    1071.479    2.619
50      1666.373    1662.543    1649.988    1634.230    1633.415    1636.261    1.165

Table 11: Experiment results (average makespan) of DRSCRO for the random graphs under different CCRs (the task number is 50).

CCR    |P| = 4    |P| = 8    |P| = 16   |P| = 32
0.1    145.093    116.068    116.058    114.592
0.2    145.261    116.200    116.163    114.700
1      153.248    120.228    119.511    118.534
2      194.401    166.014    164.246    162.292
5      508.759    498.427    492.049    487.830

Figure 14: Average makespan for random graphs under different processor numbers (task number is 50 and CCR = 0.2). [Plot of makespan versus |P| = 4, 8, 16, 32 for HEFT, DMSCRO, TMSCRO, and DRSCRO; figure not reproduced.]

among DRSCRO, TMSCRO, and DMSCRO. The convergence traces and significance tests further reveal the differences between DRSCRO and the other two algorithms. In these experiments, as suggested in [27], the stopping criterion for all three algorithms is that the total running time reaches a preset value (e.g., 180 s). For the sake of comparability, the time counting of DRSCRO is set to begin at the start of the global optimization phase. In the first extensive test bed, the makespan is averaged over 10 independent runs, while in the second extensive test bed, the makespan is averaged over 30 different random-graph running instances.
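The time-budget stopping criterion and the convergence-trace recording described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `random_makespan`, the candidate encoding, and the 50 ms budget are stand-ins; a real run would decode a molecule into a DAG schedule and evaluate its actual makespan.

```python
import random
import time

def random_makespan(schedule):
    # Hypothetical stand-in fitness: a real implementation would decode
    # `schedule` and compute the DAG makespan on the processors.
    return sum(schedule) + random.random()

def run_with_time_budget(budget_seconds, seed=0):
    """Toy search loop that stops when the wall-clock budget is spent,
    recording an (elapsed seconds, best makespan) convergence trace."""
    random.seed(seed)
    best = float("inf")
    trace = []
    start = time.monotonic()
    while time.monotonic() - start < budget_seconds:
        candidate = [random.randint(1, 9) for _ in range(5)]
        value = random_makespan(candidate)
        if value < best:  # record only improvements, as a convergence trace does
            best = value
            trace.append((time.monotonic() - start, best))
    return best, trace

best, trace = run_with_time_budget(0.05)  # 50 ms budget for the toy run
```

Plotting `trace` for each algorithm under the same budget yields curves of the kind shown in Figures 17-21.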

Figure 15: Average makespan for random graphs under different task numbers (processor number is 32 and CCR = 1.0). [Plot of makespan versus the number of tasks (10, 20, 50) for HEFT, DMSCRO, TMSCRO, and DRSCRO; figure not reproduced.]

5.4.1. Convergence Trace. The convergence traces of DRSCRO, TMSCRO, and DMSCRO for processing the molecular dynamics code and Gaussian elimination are plotted in Figures 17 and 18, respectively. Figures 19-21 show the convergence traces when processing the randomly generated DAG sets containing 10, 20, and 50 tasks, respectively. As shown in Figures 17-21, the convergence traces of the three algorithms differ clearly, and DRSCRO converges faster than the other two algorithms in every case. The reason for the better convergence rate of DRSCRO is as


Figure 16: Average makespan of DRSCRO for the random graphs under different CCRs (the task number is 50). [Plot of makespan versus CCR = 0.1, 0.2, 1, 2, 5 for P = 4, 8, 16, 32; figure not reproduced.]

Figure 17: Convergence trace for the molecular dynamics code (CCR = 1 and the number of processors is 16). [Makespan versus time for DMSCRO, TMSCRO, and DRSCRO; figure not reproduced.]

Figure 18: Convergence trace for Gaussian elimination (CCR = 0.2 and the number of processors is 8). [Makespan versus time for DMSCRO, TMSCRO, and DRSCRO; figure not reproduced.]

presented in the last paragraph of Section 4.6 (i.e., DRSCRO takes advantage of its double-reaction structure to obtain a better super molecule for accelerating convergence). Even though the VNS algorithm adds time cost to each iteration, the enhanced optimization capability of DRSCRO also gives it a better convergence rate than TMSCRO

Figure 19: Convergence trace for the set of the randomly generated DAGs with 10 tasks. [Makespan versus time for DMSCRO, TMSCRO, and DRSCRO; figure not reproduced.]

Figure 20: Convergence trace for the set of the randomly generated DAGs with 20 tasks. [Makespan versus time for DMSCRO, TMSCRO, and DRSCRO; figure not reproduced.]

Figure 21: Convergence trace for the set of the randomly generated DAGs with 50 tasks. [Makespan versus time for DMSCRO, TMSCRO, and DRSCRO; figure not reproduced.]

and DMSCRO. The simulation results show that DRSCRO converges faster than TMSCRO by 19.4% on average (by 29.3% in the best case) and faster than DMSCRO by 33.9% on average (by 41.2% in the best case).
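As a point of clarification (our reading of how such figures are computed, not a formula stated by the authors), a convergence speedup such as "19.4% faster" is a relative reduction in time-to-converge:

```python
def relative_speedup(t_baseline, t_new):
    """Percentage reduction in time-to-converge relative to a baseline,
    i.e. how a figure such as '19.4% faster on average' can be read."""
    return 100.0 * (t_baseline - t_new) / t_baseline

# If a baseline takes 100 s to converge and the new method takes 80.6 s:
gain = relative_speedup(100.0, 80.6)
```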

Moreover, the statistical analysis based on the average values achieved is presented in Section 5.4.2 to prove


Table 12: Results of Friedman tests (α = 0.05).

Method          Algorithms                    p value     Hypothesis
Friedman test   DRSCRO vs DMSCRO             2.53E-02    Rejected
                DRSCRO vs TMSCRO             2.53E-02    Rejected
                DRSCRO vs DMSCRO vs TMSCRO   6.70E-03    Rejected

Table 13: Results of Quade tests (α = 0.05).

Method          Algorithms                    p value     Hypothesis
Quade test      DRSCRO vs DMSCRO             1.32E-02    Rejected
                DRSCRO vs TMSCRO             1.32E-02    Rejected
                DRSCRO vs DMSCRO vs TMSCRO   1.09E-03    Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significance Tests. Statistical analysis is necessary for the average convergence rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are used, since the experimental results may present neither a normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level of α = 0.05 is used in all statistical tests.
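The mechanics of the Friedman test can be sketched with a small, ties-free implementation (an illustration only, with made-up makespans rather than the paper's data; a real analysis would also handle tied ranks and compute an exact p value):

```python
def friedman_statistic(*samples):
    """Friedman chi-square statistic for k related samples (no ties).
    samples: k sequences of equal length n (one value per problem instance)."""
    k = len(samples)
    n = len(samples[0])
    # Rank the k algorithms within each problem instance (1 = best/smallest).
    rank_sums = [0.0] * k
    for i in range(n):
        order = sorted(range(k), key=lambda j: samples[j][i])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Illustrative makespans (NOT the paper's data): one sequence per
# algorithm, one value per benchmark instance; lower is better.
drscro = [10.1, 20.3, 30.2, 40.1, 50.5]
tmscro = [10.4, 20.9, 30.8, 40.6, 51.0]
dmscro = [10.8, 21.2, 31.5, 41.0, 51.7]
stat = friedman_statistic(drscro, tmscro, dmscro)
# Compare against the chi-square critical value with k - 1 = 2 degrees
# of freedom at alpha = 0.05 (about 5.991).
reject_null = stat > 5.991
```

With three algorithms ranked identically on every instance, the statistic reaches its maximum n(k − 1), which is why consistent wins across many instances yield the small p values reported in Tables 12 and 13.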

Tables 12 and 13 list the results of the Friedman test and the Quade test, respectively; both reject the null hypothesis of equivalent performance. In both tests, the proposed DRSCRO is not only compared against all the algorithms together but also, as the control method, compared against each of the remaining ones. The results in Tables 12 and 13 confirm the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, as the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms in convergence rate at a significance level of 0.05.

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind, because, according to the No-Free-Lunch theorem [34], every well-designed metaheuristic algorithm has the same performance for searching optimal solutions when averaged over all possible fitness functions. However, as the convergence tests show, the proposed DRSCRO achieves better performance and finds good solutions faster than the other metaheuristic algorithms of the same kind. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO obtains a better super molecule by a metaheuristic method and, under the consideration of optimizing both the scheduling order and the processor assignment, takes advantage of the

VNS algorithm in the global optimization phase to improve its optimization capability. A load-balance neighborhood structure is also applied in the ineffective reaction operator for better intensification capability. The new processor selection model utilized in the neighborhood structures also promotes the efficiency of the VNS algorithm.
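A simplified sketch of a load-balance neighborhood move (our illustration, not the paper's exact operator: it ignores precedence constraints and communication costs, and the helper names are hypothetical):

```python
def load_balance_move(assignment, exec_time):
    """One load-balance neighborhood move: shift a task from the most
    loaded processor to the least loaded one.

    assignment: dict task -> processor
    exec_time:  dict (task, processor) -> execution time on that processor
    """
    procs = {p for (_, p) in exec_time}
    load = {p: 0.0 for p in procs}
    for task, p in assignment.items():
        load[p] += exec_time[(task, p)]
    heaviest = max(load, key=load.get)
    lightest = min(load, key=load.get)
    if heaviest == lightest:
        return assignment
    # Pick the task whose transfer best evens out the two loads
    # (accounting for heterogeneous execution times on each processor).
    movable = [t for t, p in assignment.items() if p == heaviest]
    task = min(movable, key=lambda t: abs(
        (load[heaviest] - exec_time[(t, heaviest)])
        - (load[lightest] + exec_time[(t, lightest)])))
    new_assignment = dict(assignment)
    new_assignment[task] = lightest
    return new_assignment

# Toy usage: three unit-time tasks all piled on processor 0.
times = {(t, p): 1.0 for t in "abc" for p in (0, 1)}
balanced = load_balance_move({"a": 0, "b": 0, "c": 0}, times)
```

In a full VNS, a move like this would only be accepted after the resulting schedule is re-evaluated for feasibility and makespan.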

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new processor selection model utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load-balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and it also obtains better performance on average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to further improve its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will also be extended to target two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and


the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406-471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506-521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31-September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230-237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141-147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57-69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330-343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138-153, 1990.

[10] G. C. Sih and E. A. Lee, "A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175-187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23-32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390-1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142-149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395-406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462-487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825-837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113-120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13-22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17-19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1-7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824-834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567-581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272-277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45-72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455-463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381-399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967-980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306-1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1-5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379-396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458-463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097-1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867-2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67-82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175-193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319-360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1-8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683-691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044-2064, 2010.


[39] V A F Almeida I M M Vasconcelos J N C Arabe and D AMenasce ldquoUsing random task graphs to investigate the potentialbenefits of heterogeneity in parallel systemsrdquo in Proceedingsof the ACMIEEE Conference on Supercomputing pp 683ndash691IEEE Computer Society Minneapolis Minn USA November1992

[40] S Garcıa A Fernandez J Luengo and F Herrera ldquoAdvancednonparametric tests for multiple comparisons in the design ofexperiments in computational intelligence and data miningexperimental analysis of powerrdquo Information Sciences vol 180no 10 pp 2044ndash2064 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 18: Research Article DRSCRO: A Metaheuristic Algorithm for ...downloads.hindawi.com/journals/mpe/2015/396582.pdf · Research Article DRSCRO: A Metaheuristic Algorithm for Task Scheduling

18 Mathematical Problems in Engineering

Table 12: Results of Friedman tests (α = 0.05).

Method          Algorithms                  p value    Hypothesis
Friedman test   DRSCRO vs. DMSCRO          2.53E-02   Rejected
Friedman test   DRSCRO vs. TMSCRO          2.53E-02   Rejected
Friedman test   DRSCRO vs. DMSCRO, TMSCRO  6.70E-03   Rejected

Table 13: Results of Quade tests (α = 0.05).

Method       Algorithms                  p value    Hypothesis
Quade test   DRSCRO vs. DMSCRO          1.32E-02   Rejected
Quade test   DRSCRO vs. TMSCRO          1.32E-02   Rejected
Quade test   DRSCRO vs. DMSCRO, TMSCRO  1.09E-03   Rejected

that DRSCRO outperforms the other CRO-based algorithms for DAG scheduling from a statistical point of view.

5.4.2. Significant Tests. Statistical analysis is necessary for the average coverage rates obtained in all cases by DRSCRO, TMSCRO, and DMSCRO, which are metaheuristic methods, in order to find significant differences among these results. Following the recommendations in [40], nonparametric tests are specifically considered, since the experimental results may present neither normal distribution nor variance homogeneity. Therefore, the Friedman test and the Quade test are applied to check whether significant differences exist in the performance of these three algorithms. A significance level α = 0.05 is used in all statistical tests.

Tables 12 and 13, respectively, list the results of the Friedman test and the Quade test, which both reject the null hypothesis of equivalent performance. In both tests, our proposed DRSCRO is not only compared against all the algorithms simultaneously but also compared pairwise against each of the remaining ones as the control method. The results in Tables 12 and 13 validate the significant differences in the performance of DRSCRO, TMSCRO, and DMSCRO.

In sum, it can be concluded that DRSCRO, which is the control algorithm, statistically outperforms the other CRO-based DAG scheduling algorithms on coverage rate at a significance level of 0.05.
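To make the test procedure above concrete, the Friedman statistic can be computed by ranking the algorithms within each test case and measuring how far the mean ranks deviate from their expected value under the null hypothesis. The sketch below is illustrative only (it is not the authors' code, and the data in the usage example are hypothetical, not the paper's results):

```python
def friedman_statistic(results):
    """Friedman chi-square statistic.

    results: n blocks (test cases); each block lists k measurements,
    one per algorithm (lower is better, e.g. makespan).
    Returns (mean ranks per algorithm, chi-square statistic).
    """
    n = len(results)
    k = len(results[0])
    rank_sums = [0.0] * k
    for block in results:
        # Rank algorithms within the block (1 = best), averaging ties.
        order = sorted(range(k), key=lambda j: block[j])
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            while j + 1 < k and block[order[j + 1]] == block[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1  # average of the tied rank positions
            for m in range(i, j + 1):
                ranks[order[m]] = avg
            i = j + 1
        for j in range(k):
            rank_sums[j] += ranks[j]
    mean_ranks = [s / n for s in rank_sums]
    # Chi-square form of the Friedman statistic.
    chi2 = 12.0 * n / (k * (k + 1)) * sum(
        (r - (k + 1) / 2) ** 2 for r in mean_ranks)
    return mean_ranks, chi2

# Hypothetical data: one algorithm always best among three, over 4 cases.
mean_ranks, chi2 = friedman_statistic([[1.0, 2.0, 3.0]] * 4)
print(mean_ranks, chi2)  # → [1.0, 2.0, 3.0] 8.0
```

The statistic is then compared against a chi-square distribution with k − 1 degrees of freedom (or, in practice, handed to a library routine that also returns the p value).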

6. Discussion

The experimental results of the makespan tests show that the performance of DRSCRO is very similar to that of the other metaheuristic algorithms of the same kind because, according to the No-Free-Lunch theorem, every well-designed metaheuristic algorithm has the same performance for searching optimal solutions when averaged over all possible fitness functions. However, as the convergence tests show, the proposed DRSCRO can find good solutions faster than the other similar metaheuristic algorithms. The reason, as analyzed in the last paragraph of Section 4.6, is that DRSCRO creates a better super molecule by a metaheuristic method and, under the consideration of optimizing both scheduling order and processor assignment, takes advantage of the VNS algorithm in the global optimization phase to improve the optimization capability. A load balance neighborhood structure is also applied in the ineffective reaction operator for a better intensification capability, and the new processor selection model utilized in the neighborhood structures further promotes the efficiency of the VNS algorithm.
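The basic VNS scheme the discussion refers to can be sketched generically: shake the incumbent solution in the k-th neighborhood, run a local search, and restart from the first neighborhood whenever an improvement is found. The toy problem below (sorting a permutation by swaps) and all function names are hypothetical stand-ins, not the paper's scheduling model or processor selection rule:

```python
import random

def local_search(perm, cost):
    """Best-improvement descent over the swap neighborhood."""
    improved = True
    while improved:
        improved = False
        best = list(perm)
        for i in range(len(perm)):
            for j in range(i + 1, len(perm)):
                cand = list(perm)
                cand[i], cand[j] = cand[j], cand[i]
                if cost(cand) < cost(best):
                    best = cand
                    improved = True
        perm = best
    return perm

def vns(perm, cost, k_max=3, iters=20, seed=0):
    """Basic VNS: shake with k random swaps, then local search."""
    rng = random.Random(seed)
    best = local_search(perm, cost)
    for _ in range(iters):
        k = 1
        while k <= k_max:
            cand = list(best)
            for _ in range(k):  # shaking: k random swaps
                i, j = rng.randrange(len(cand)), rng.randrange(len(cand))
                cand[i], cand[j] = cand[j], cand[i]
            cand = local_search(cand, cost)
            if cost(cand) < cost(best):
                best, k = cand, 1  # improvement: restart neighborhoods
            else:
                k += 1  # no improvement: widen the neighborhood
    return best

# Toy objective: total distance of each element from its sorted position.
cost = lambda p: sum(abs(v - i) for i, v in enumerate(p))
print(vns([5, 4, 3, 2, 1, 0], cost))  # → [0, 1, 2, 3, 4, 5]
```

In DRSCRO the neighborhoods are scheduling-specific (including the load balance structure mentioned above) rather than plain swaps, but the shake/descend/restart control loop is the same.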

7. Conclusion and Future Study

An algorithm named Double-Reaction-Structured CRO (DRSCRO) is developed in this paper for DAG scheduling on heterogeneous systems. DRSCRO includes two reaction phases, one for super molecule selection and another for global optimization. Unlike other CRO-based algorithms for DAG scheduling on heterogeneous systems, the super molecule selection phase obtains a super molecule by a metaheuristic method for a better convergence rate. In addition, to promote the intensification capability of DRSCRO, the VNS algorithm, with a new processor selection model utilized in its neighborhood structures, is used as the initialization of the global optimization phase, and the load balance neighborhood structure of VNS is also applied in the ineffective reaction operator. The experimental results show that DRSCRO achieves a higher speedup than the other CRO-based algorithms known to us, and it also obtains a better average makespan in some cases.

In future work, we will analyze the parameter sensitivity of DRSCRO to further improve its effectiveness. Moreover, to make the proposed algorithm more practical, DRSCRO will be extended to address two main objectives: (1) minimization of the schedule length (time domain) and (2) minimization of the number of used processors (resource domain).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 61462073) and the National High Technology Research and Development Program of China (863 Program) (Grant no. 2015AA020107).

References

[1] T. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall, New York, NY, USA, 1992.

[2] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, 1999.

[3] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.

[4] Y.-K. Kwok and I. Ahmad, "Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 506–521, 1996.

[5] F. Suter, F. Desprez, and H. Casanova, "From heterogeneous task scheduling to heterogeneous mixed parallel scheduling," in Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31–September 3, 2004, Proceedings, vol. 3149 of Lecture Notes in Computer Science, pp. 230–237, Springer, Berlin, Germany, 2004.

[6] C.-Y. Lee, J. J. Hwang, Y.-C. Chow, and F. D. Anger, "Multiprocessor scheduling with interprocessor communication delays," Operations Research Letters, vol. 7, no. 3, pp. 141–147, 1988.

[7] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the 7th Heterogeneous Computing Workshop, pp. 57–69, IEEE Computer Society, Orlando, Fla, USA, March 1998.

[8] M.-Y. Wu and D. D. Gajski, "Hypertool: a programming aid for message-passing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330–343, 1990.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel program tasks onto arbitrary target machines," Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.

[10] G. C. Sih and E. A. Lee, "Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175–187, 1993.

[11] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, vol. 5, no. 1, pp. 23–32, 1988.

[12] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Transactions on Software Engineering, vol. 16, no. 12, pp. 1390–1401, 1990.

[13] A. Tumeo, C. Pilato, F. Ferrandi, D. Sciuto, and P. L. Lanzi, "Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems," in Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS '08), pp. 142–149, IEEE Computer Society, Samos, Greece, June 2008.

[14] I. Ahmad and M. K. Dhodhi, "Multiprocessor scheduling in a genetic paradigm," Parallel Computing, vol. 22, no. 3, pp. 395–406, 1996.

[15] M. R. Bonyadi and M. E. Moghaddam, "A bipartite genetic algorithm for multi-processor task scheduling," International Journal of Parallel Programming, vol. 37, no. 5, pp. 462–487, 2009.

[16] R. C. Correa, A. Ferreira, and P. Rebreyend, "Scheduling multiprocessor tasks with genetic algorithms," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 8, pp. 825–837, 1999.

[17] E. S. H. Hou, N. Ansari, and H. Ren, "Genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 113–120, 1994.

[18] F. A. Omara and M. M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing, vol. 70, no. 1, pp. 13–22, 2010.

[19] A. Singh, M. Sevaux, and A. Rossi, "A hybrid grouping genetic algorithm for multiprocessor scheduling," in Contemporary Computing: Second International Conference, IC3 2009, Noida, India, August 17–19, 2009, Proceedings, vol. 40 of Communications in Computer and Information Science, pp. 1–7, Springer, Berlin, Germany, 2009.

[20] A. S. Wu, H. Yu, S. Jin, K.-C. Lin, and G. Schiavone, "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 9, pp. 824–834, 2004.

[21] Y. Wen, H. Xu, and J. Yang, "A heuristic-based hybrid genetic-variable neighborhood search algorithm for task scheduling in heterogeneous multiprocessor system," Information Sciences, vol. 181, no. 3, pp. 567–581, 2011.

[22] R. Shanmugapriya, S. Padmavathi, and S. Shalinie, "Contention awareness in task scheduling using Tabu search," in Proceedings of the IEEE International Advance Computing Conference, pp. 272–277, IEEE Computer Society, Patiala, India, March 2009.

[23] S. C. Porto and C. C. Ribeiro, "A tabu search approach to task scheduling on heterogeneous processors under precedence constraints," International Journal of High Speed Computing, vol. 7, no. 1, pp. 45–72, 1995.

[24] A. V. Kalashnikov and V. A. Kostenko, "A parallel algorithm of simulated annealing for multiprocessor scheduling," Journal of Computer and Systems Sciences International, vol. 47, no. 3, pp. 455–463, 2008.

[25] A. Y. S. Lam and V. O. K. Li, "Chemical-reaction-inspired metaheuristic for optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 3, pp. 381–399, 2010.

[26] P. Choudhury, R. Kumar, and P. P. Chakrabarti, "Hybrid scheduling of dynamic task graphs with selective duplication for multiprocessors under memory and time constraints," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 967–980, 2008.

[27] Y. Xu, K. Li, L. He, and T. K. Truong, "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization," Journal of Parallel and Distributed Computing, vol. 73, no. 9, pp. 1306–1322, 2013.

[28] J. Xu and A. Lam, "Chemical reaction optimization for the grid scheduling problem," in Proceedings of the IEEE International Conference on Communications, pp. 1–5, IEEE Computer Society, Cape Town, South Africa, May 2010.

[29] B. Varghese, G. McKee, and V. Alexandrov, "Can agent intelligence be used to achieve fault tolerant parallel computing systems?" Parallel Processing Letters, vol. 21, no. 4, pp. 379–396, 2011.

[30] Y. Jiang, Z. Shao, and Y. Guo, "A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization," The Scientific World Journal, vol. 2014, Article ID 404375, 23 pages, 2014.

[31] J. Xu, A. Y. S. Lam, and V. O. K. Li, "Stock portfolio selection using chemical reaction optimization," in Proceedings of the International Conference on Operations Research and Financial Engineering, pp. 458–463, WASET, Paris, France, June 2011.

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 19: Research Article DRSCRO: A Metaheuristic Algorithm for ...downloads.hindawi.com/journals/mpe/2015/396582.pdf · Research Article DRSCRO: A Metaheuristic Algorithm for Task Scheduling

Mathematical Problems in Engineering 19

the National High Technology Research and DevelopmentProgramof China (863 Program) (Grant no 2015AA020107)

References

[1] T Lewis and H Elrewini Introduction to Parallel ComputingPrentice-Hall New York NY USA 1992

[2] Y-K Kwok and I Ahmad ldquoStatic scheduling algorithms forallocating directed task graphs to multiprocessorsrdquo ACM Com-puting Surveys vol 31 no 4 pp 406ndash471 1999

[3] H Topcuoglu S Hariri and M-Y Wu ldquoPerformance-effectiveand low-complexity task scheduling for heterogeneous comput-ingrdquo IEEE Transactions on Parallel and Distributed Systems vol13 no 3 pp 260ndash274 2002

[4] Y-K Kwok and I Ahmad ldquoDynamic critical-path schedulingan effective technique for allocating task graphs to multiproces-sorsrdquo IEEETransactions on Parallel andDistributed Systems vol7 no 5 pp 506ndash521 1996

[5] F Suter F Desprez and H Casanova ldquoFrom heterogeneoustask scheduling to heterogeneous mixed parallel schedulingrdquo inEuro-Par 2004 Parallel Processing 10th International Euro-ParConference Pisa Italy August 31- September 3 2004 Proceed-ings vol 3149 of Lecture Notes in Computer Science pp 230ndash237Springer Berlin Germany 2004

[6] C-Y Lee J J Hwang Y-C Chow and F D Anger ldquoMultipro-cessor scheduling with interprocessor communication delaysrdquoOperations Research Letters vol 7 no 3 pp 141ndash147 1988

[7] M Maheswaran and H J Siegel ldquoA dynamic matching andscheduling algorithm for heterogeneous computing systemsrdquo inProceedings of the 7th Heterogeneous Computing Workshop pp57ndash69 IEEE Computer Society Orlando Fla USAMarch 1998

[8] M-Y Wu and D D Gajski ldquoHypertool a programming aidformessage-passing systemsrdquo IEEE Transactions on Parallel andDistributed Systems vol 1 no 3 pp 330ndash343 1990

[9] H El-Rewini and T G Lewis ldquoScheduling parallel programtasks onto arbitrary target machinesrdquo Journal of Parallel andDistributed Computing vol 9 no 2 pp 138ndash153 1990

[10] G C Sih and E A Lee ldquoCompile-time scheduling heuristic forinterconnection-constrained heterogeneous processor archi-tecturesrdquo IEEE Transactions on Parallel andDistributed Systemsvol 4 no 2 pp 175ndash187 1993

[11] B Kruatrachue and T Lewis ldquoGrain size determination forparallel processingrdquo IEEE Software vol 5 no 1 pp 23ndash32 1988

[12] M A Al-Mouhamed ldquoLower bound on the number of proces-sors and time for scheduling precedence graphs with communi-cation costsrdquo IEEETransactions on Software Engineering vol 16no 12 pp 1390ndash1401 1990

[13] A Tumeo C Pilato F Ferrandi D Sciuto and P L LanzildquoAnt colony optimization for mapping and scheduling in het-erogeneous multiprocessor systemsrdquo in Proceedings of the Inter-national Conference on Embedded Computer Systems Architec-turesModeling and Simulation (SAMOS rsquo08) pp 142ndash149 IEEEComputer Society Samos Greece June 2008

[14] I Ahmad and M K Dhodhi ldquoMultiprocessor scheduling in agenetic paradigmrdquo Parallel Computing vol 22 no 3 pp 395ndash406 1996

[15] M R Bonyadi and M E Moghaddam ldquoA bipartite geneticalgorithm for multi-processor task schedulingrdquo InternationalJournal of Parallel Programming vol 37 no 5 pp 462ndash487 2009

[16] R C Correa A Ferreira and P Rebreyend ldquoScheduling multi-processor tasks with genetic algorithmsrdquo IEEE Transactions onParallel andDistributed Systems vol 10 no 8 pp 825ndash837 1999

[17] E S H Hou N Ansari and H Ren ldquoGenetic algorithm formultiprocessor schedulingrdquo IEEE Transactions on Parallel andDistributed Systems vol 5 no 2 pp 113ndash120 1994

[18] F A Omara and M M Arafa ldquoGenetic algorithms for taskscheduling problemrdquo Journal of Parallel and Distributed Com-puting vol 70 no 1 pp 13ndash22 2010

[19] A Singh M Sevaux and A Rossi ldquoA hybrid grouping geneticalgorithm for multiprocessor schedulingrdquo in ContemporaryComputing Second International Conference IC3 2009 NoidaIndia August 17ndash19 2009 Proceedings vol 40 of Communica-tions in Computer and Information Science pp 1ndash7 SpringerBerlin Germany 2009

[20] A S Wu H Yu S Jin K-C Lin and G Schiavone ldquoAnincremental genetic algorithm approach to multiprocessorschedulingrdquo IEEE Transactions on Parallel and DistributedSystems vol 15 no 9 pp 824ndash834 2004

[21] Y Wen H Xu and J Yang ldquoA heuristic-based hybrid genetic-variable neighborhood search algorithm for task schedulingin heterogeneous multiprocessor systemrdquo Information Sciencesvol 181 no 3 pp 567ndash581 2011

[22] R Shanmugapriya S Padmavathi and S Shalinie ldquoContentionawareness in task scheduling using Tabu searchrdquo in Proceedingsof the IEEE International Advance Computing Conference pp272ndash277 IEEE Computer Society Patiala India March 2009

[23] S C Porto and C C Ribeiro ldquoA tabu search approach totask scheduling on heterogeneous processors under precedenceconstraintsrdquo International Journal of High Speed Computing vol7 no 1 pp 45ndash72 1995

[24] A V Kalashnikov and V A Kostenko ldquoA parallel algorithm ofsimulated annealing for multiprocessor schedulingrdquo Journal ofComputer and Systems Sciences International vol 47 no 3 pp455ndash463 2008

[25] A Y S Lam and V O K Li ldquoChemical-reaction-inspiredmeta-heuristic for optimizationrdquo IEEE Transactions on EvolutionaryComputation vol 14 no 3 pp 381ndash399 2010

[26] P Choudhury R Kumar and P P Chakrabarti ldquoHybridscheduling of dynamic task graphs with selective duplicationfor multiprocessors under memory and time constraintsrdquo IEEETransactions on Parallel and Distributed Systems vol 19 no 7pp 967ndash980 2008

[27] Y Xu K Li LHe andT K Truong ldquoADAG scheduling schemeon heterogeneous computing systems using double molecularstructure-based chemical reaction optimizationrdquo Journal ofParallel andDistributed Computing vol 73 no 9 pp 1306ndash13222013

[28] J Xu and A Lam ldquoChemical reaction optimization for the gridscheduling problemrdquo in Proceedings of the IEEE InternationalConference on Communications pp 1ndash5 IEEE Computer Soci-ety Cape Town South Africa May 2010

[29] B Varghese G McKee and V Alexandrov ldquoCan agent intel-ligence be used to achieve fault tolerant parallel computingsystemsrdquo Parallel Processing Letters vol 21 no 4 pp 379ndash3962011

[30] Y Jiang Z Shao and Y Guo ldquoA DAG scheduling scheme onheterogeneous computing systems using tuple-based chemicalreaction optimizationrdquo Scientific World Journal vol 2014 Arti-cle ID 404375 23 pages 2014

[31] J Xu AY S Lam and V O K Li ldquoStock portfolio selectionusing chemical reaction optimizationrdquo in Proceedings of theInternational Conference on Operations Research and FinancialEngineering pp 458ndash463 WASET Paris France June 2011

20 Mathematical Problems in Engineering

[32] N Mladenovic and P Hansen ldquoVariable neighborhood searchrdquoComputers amp Operations Research vol 24 no 11 pp 1097ndash11001997

[33] K Li X Tang and K Li ldquoEnergy-efficient stochastic taskscheduling on heterogeneous computing systemsrdquo IEEE Trans-actions on Parallel and Distributed Systems vol 25 no 11 pp2867ndash2876 2014

[34] D HWolpert andWGMacready ldquoNo free lunch theorems foroptimizationrdquo IEEE Transactions on Evolutionary Computationvol 1 no 1 pp 67ndash82 1997

[35] M A Khan ldquoScheduling for heterogeneous Systems usingconstrained critical pathsrdquo Parallel Computing vol 38 no 4-5pp 175ndash193 2012

[36] P Hansen N Mladenovic and J M Perez ldquoVariable neigh-borhood search methods and applicationsrdquo 4OR-A QuarterlyJournal of Operations Research vol 6 no 4 pp 319ndash360 2008

[37] F Glover and G Kochenberger Handbook of MetaheuristicsSpringer New York NY USA 2003

[38] S Kim and J Browne ldquoA general approach to mapping of par-allel computation upon multiprocessor architecturesrdquo in Pro-ceedings of the International Conference on Parallel Processingpp 1ndash8 Pennsylvania State University Pennsylvania State Uni-versity Press University Park Pa USA August 1988

[39] V A F Almeida I M M Vasconcelos J N C Arabe and D AMenasce ldquoUsing random task graphs to investigate the potentialbenefits of heterogeneity in parallel systemsrdquo in Proceedingsof the ACMIEEE Conference on Supercomputing pp 683ndash691IEEE Computer Society Minneapolis Minn USA November1992

[40] S Garcıa A Fernandez J Luengo and F Herrera ldquoAdvancednonparametric tests for multiple comparisons in the design ofexperiments in computational intelligence and data miningexperimental analysis of powerrdquo Information Sciences vol 180no 10 pp 2044ndash2064 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 20: Research Article DRSCRO: A Metaheuristic Algorithm for ...downloads.hindawi.com/journals/mpe/2015/396582.pdf · Research Article DRSCRO: A Metaheuristic Algorithm for Task Scheduling

20 Mathematical Problems in Engineering

[32] N. Mladenovic and P. Hansen, "Variable neighborhood search," Computers & Operations Research, vol. 24, no. 11, pp. 1097–1100, 1997.

[33] K. Li, X. Tang, and K. Li, "Energy-efficient stochastic task scheduling on heterogeneous computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2867–2876, 2014.

[34] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.

[35] M. A. Khan, "Scheduling for heterogeneous systems using constrained critical paths," Parallel Computing, vol. 38, no. 4-5, pp. 175–193, 2012.

[36] P. Hansen, N. Mladenovic, and J. M. Perez, "Variable neighborhood search: methods and applications," 4OR: A Quarterly Journal of Operations Research, vol. 6, no. 4, pp. 319–360, 2008.

[37] F. Glover and G. Kochenberger, Handbook of Metaheuristics, Springer, New York, NY, USA, 2003.

[38] S. Kim and J. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proceedings of the International Conference on Parallel Processing, pp. 1–8, Pennsylvania State University Press, University Park, Pa, USA, August 1988.

[39] V. A. F. Almeida, I. M. M. Vasconcelos, J. N. C. Arabe, and D. A. Menasce, "Using random task graphs to investigate the potential benefits of heterogeneity in parallel systems," in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 683–691, IEEE Computer Society, Minneapolis, Minn, USA, November 1992.

[40] S. Garcia, A. Fernandez, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power," Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.
