
Dynamic Task Offloading and Scheduling for Low-Latency IoT Services in Multi-Access Edge Computing

Hyame Assem Alameddine, Sanaa Sharafeddine, Samir Sebbah, Sara Ayoubi, and Chadi Assi, Senior Member, IEEE

Abstract— Multi-access edge computing (MEC) has recently emerged as a novel paradigm to facilitate access to advanced computing capabilities at the edge of the network, in close proximity to end devices, thereby enabling a rich variety of latency-sensitive services demanded by various emerging industry verticals. Internet-of-Things (IoT) devices, being highly ubiquitous and connected, can offload their computational tasks to be processed by applications hosted on the MEC servers due to their limited battery, computing, and storage capacities. Such IoT applications providing services to offloaded tasks of IoT devices are hosted on edge servers with limited computing capabilities. Given the heterogeneity in the requirements of the offloaded tasks (different computing requirements, latency, and so on) and the limited MEC capabilities, we jointly decide on the task offloading (task-to-application assignment) and scheduling (the order of executing the tasks), which yields a challenging problem of combinatorial nature. Furthermore, we jointly decide on the computing resource allocation for the hosted applications, and we refer to this problem as the Dynamic Task Offloading and Scheduling problem, encompassing the three subproblems mentioned earlier. We mathematically formulate this problem and, owing to its complexity, we design a novel decomposition based on the technique of Logic-Based Benders Decomposition. This technique solves a relaxed master problem, with fewer constraints, and a subproblem whose resolution allows the generation of cuts which iteratively guide the master to tighten its search space. Ultimately, both the master and the sub-problem converge to yield the optimal solution. We show that this technique offers several orders of magnitude (more than 140 times) improvement in run time for the studied instances. Another advantage of this method is its capability of providing solutions with performance guarantees. Finally, we use this method to highlight insightful performance trends for different vertical industries as a function of multiple system parameters, with a focus on delay-sensitive use cases.

Index Terms— Multi-access edge computing, Internet of Things, 5G, task offloading, resource allocation, scheduling.

Manuscript received October 10, 2018; revised January 10, 2019; accepted January 13, 2019. Date of publication January 31, 2019; date of current version February 14, 2019. This work was supported in part by NSERC and in part by Concordia University, Canada. (Corresponding author: Chadi Assi.)
H. A. Alameddine, S. Sebbah, and C. Assi are with Concordia University, Montreal, QC H3G 1M8, Canada (e-mail: [email protected]).
S. Sharafeddine is with the Lebanese American University, Beirut, Lebanon. S. Ayoubi is with INRIA, Paris, France.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSAC.2019.2894306

I. INTRODUCTION

The expectations towards a premium Quality of Experience (QoE) are widely increasing with the recent advancement of 5G networks, paving the way towards a broad

new set of services such as augmented/virtual reality, traffic safety, image and face recognition, etc. [1], [2]. Enabling proximity services with fast service delivery at any time in crowded areas goes hand-in-hand with the goal of 5G systems of providing user-centric QoE that includes, but is not limited to, ultra-low latency (i.e., ≤ 5 ms), ultra-high reliability (i.e., 99.999%), and support for 1000 times higher data volumes [1], [3]. The need to support higher data volumes results from the tremendous foreseen increase in the number of wearable Internet of Things (IoT) devices, expected to reach 50 billion in 2020 as reported by Cisco [4], along with a subscription uptake forecast for 5G to reach 1 billion by 2023 as predicted by Ericsson [5].

The constrained computation and storage resources of mobile User Equipments (UEs) and IoT devices, in addition to their limited battery life, make them unsuitable for supporting the processing of resource-hungry mobile and IoT applications providing 5G services such as online gaming and face/speech recognition applications [2], [6], [7]. One possible solution to overcome these limitations is the use of Mobile Cloud Computing (MCC), which allows the UEs to offload their computation-intensive tasks to be processed on applications deployed on a powerful, centralized remote cloud accessible via the Internet or a core network of a mobile operator, thus expanding the capability of mobile devices [2], [8]. However, such an approach incurs high communication delays due to the distant location of the remote cloud from the user, which violates the QoE requirements of some real-time delay-sensitive applications. Hence, in order to provide fast service delivery compliant with a real-time context, Mobile Edge Computing (MEC) has been introduced to bring cloud computing resources to the edge of the network, closer to the end-user [1], [9]. MEC was first introduced to accelerate the advancements on edge computing in mobile networks and within the Radio Access Network (RAN). However, the definition was later broadened to "Edge Computing refers to a broad set of techniques designed to move computing and storage out of the remote cloud (public or private) and closer to the source of data" [10] in order to accommodate various access technologies (e.g., WiFi, Long-Term Evolution (LTE), etc.) [11]. As such, the European Telecommunications Standards Institute (ETSI) dropped the "Mobile" from MEC and renamed it Multi-access Edge Computing (MEC) [1], [11]. It is worth noting that other edge computing paradigms such as cloudlet and Fog computing


exist and differ by the access technology they use and the applications they target [12].

MEC consists of edge servers deployed at the edge of the network and implemented either at the cellular Base Stations (BSs) or at the local wireless Access Points (APs) [6]. As the MEC computation resources are more limited than those available in MCC, MEC servers can at any time offload their demanding tasks to MCC servers through the Internet whenever they are overloaded. Further, multiple MEC servers can collaborate and offload their tasks to each other (e.g., via a backhaul network) to provide better services for the mobile users through balancing their workloads and sharing their resources [8]. For instance, a nearby edge server can choose to offload the task of a connected UE to another edge server if it does not host the required application to process the task, or if its application is overloaded or cannot be allocated enough computing resources to process the task within its delay requirement. Hence, efficiently utilizing MEC resources is necessary to guarantee its foreseen benefits, which are intimately associated with solving the following challenges: 1) The task offloading problem, which consists of determining to which edge server each task should be offloaded; more precisely, it consists of associating each task to an application hosted on an edge server and able to process it. 2) The application resource allocation problem, which determines the computing resources to be allocated to each application1 deployed on an edge server in order to process all its assigned tasks within their delay requirements. 3) The task scheduling problem, which decides on the order in which each task should be processed on the shared application while meeting its deadline.

1 An IoT application can be hosted on a Virtual Machine (VM) or a container (e.g., Docker) at the MEC; the amount of resources of the VM/container is to be determined.

While some work in the literature focused on determining the edge server to which each task should be offloaded and the computing resources it needs to be allocated [6], [13]–[16], others addressed the joint problem of task offloading and scheduling either through stochastic optimization [9], [17] or using algorithmic solutions [8], [18], [19]. The methods explored in existing works are approximate solutions which do not explore the benefits that can be brought by enabling dynamic modifications of the computing resources allocated to the shared IoT applications and their impact on the scheduling of offloaded tasks.

In this work, we follow a more holistic approach for task offloading with joint resource allocation and scheduling, with a special focus on delay-sensitive IoT services [20]. This is motivated by recent 5G standardization activities on ultra-low latency use cases, in addition to the emerging trend of zero-touch networks empowered by virtualization technologies that enable dynamic scaling of resources. Hence, our contributions can be summarized as follows:

• We mathematically define and formulate the Dynamic Task Offloading and Scheduling problem (DTOS) as a Mixed Integer Program (MIP) (DTOS-MIP).

• Given its complexity, we propose DTOS-LBBD, a Logic-Based Benders Decomposition (LBBD) approach, to efficiently solve the DTOS problem to optimality. The DTOS-LBBD decomposes the DTOS problem into a master problem that performs the task offloading and the resource allocation, and multiple sub-problems, each addressing the scheduling of the tasks offloaded to a single IoT application.

• Extensive numerical evaluations are carried out to examine the efficiency of the DTOS-LBBD compared to the DTOS-MIP in terms of runtime. In addition, valuable performance trends are explored to highlight the impact of task offloading and scheduling in meeting the diverse QoE requirements aligned with 5G vertical industries.

The remainder of the paper is organized as follows. Section II presents the literature review. Section III discusses the system model and motivates the problem. Section IV defines and mathematically formulates the DTOS problem. Section V presents and explains the various aspects of the DTOS-LBBD approach. Our numerical evaluation is presented in Section VI. We conclude in Section VII.

II. LITERATURE REVIEW

A. Joint Task Offloading and Resource Allocation

Sun and Ansari [14] solved the latency-aware workload offloading (LEAD) problem, where they mathematically formulated the task offloading problem under the objective of minimizing the average response time for mobile users and presented an algorithmic approach to solve it efficiently. The limited resource pool in MEC, in comparison to centralized cloud computing, motivated many works in the literature to jointly address the resource allocation and task offloading problems in the quest to efficiently utilize these resources [11].

Tran and Pompili [6] mathematically formulated the joint problem of task offloading and resource allocation in MEC, where they account not only for the computing resource allocation but also for the transmission power allocation of mobile users. Given the NP-hardness of the problem, they decompose it into a task offloading problem, which they solve using a heuristic approach, and a resource allocation problem, which they solve using convex and quasi-convex optimization techniques. Unlike [6], the work of [13] focused on minimizing the cost of online resource allocation in MEC under unpredictable resource prices and user mobility. Wang [13] provides an online optimization-based algorithm to solve the resource allocation problem of mobile users at each time slot by adapting their allocated resources based on the optimal solution obtained at the previous one.

The work in [15] and [16] accounted for mobile users requesting a determined type of IoT application. With the objective of minimizing the average response time in terms of network and computing delays, Fan and Ansari [16] formulated and addressed the problem of placing IoT applications of different types on the existing MEC servers while deciding on their computing resources and determining the tasks that will be offloaded to each of them. In their work, they specified a maximum allowable computing delay for each application. In contrast, Jia et al. [15] considered a Network Function Virtualization (NFV)-based MEC, where they solved the task offloading problem for a set of mobile users requesting a


specific type of Virtual Network Function (VNF) (a software implementation of a network function) within specific latency requirements. The work in [7] aimed at evaluating the impact of edge computing and its enabling technologies on the response time. Thus, Premsankar et al. [7] consider the special use case of a mobile 3-D gaming application and evaluate the response delay incurred by offloading the tasks to be processed on edge servers deployed at three different locations (local deployment, special-purpose cloud infrastructure, and commercial public cloud). Their experimental evaluation shows that the location of edge servers and the virtualization technology used (i.e., container, Virtual Machine (VM), bare metal) highly impact the latency experienced. Other works on the task offloading and resource allocation problem have been reviewed in [12] and [21].

B. Task Scheduling

Efficient resource utilization entails a smart orchestration of resource sharing, which cannot be accomplished without proficient scheduling of their utilization. Hence, to maximize the revenue of the infrastructure owner, Katsalis et al. [17] devised a Lyapunov optimization framework to address the VM placement and scheduling problem while accounting for the Service Level Agreement (SLA) of time-critical services. Their scheduling approach considers the number and the type (small, medium, large) of VMs to deploy at each time slot for each mobile service operator based on the variability of its workload. Similarly, Lyu et al. [9] employed a Lyapunov function to decide on the offloading schedules of tasks while stochastically maximizing the network utility under partially out-of-date network state information, without any consideration for the tasks' delay requirements. Mao et al. [18] jointly optimized the task offloading and scheduling problems along with the transmit power allocation problem. They were interested in minimizing the weighted sum of execution delay and service energy consumption. They decomposed the problem into a task offloading and scheduling problem, which they solve using Johnson's algorithm with the objective of minimizing the makespan of all the jobs, and a transmit power allocation problem, which they address using convex optimization. The work in [18] considers that all the jobs share a single CPU core at the edge server while overlooking the deadline requirements of the tasks.

The task assignment and scheduling problem was investigated by Wang et al. [19] as well, who presented an algorithmic solution to solve the problem while accounting for the deadline requirements of the tasks and for the scheduling of their transmission and computation. Tan et al. [8] solved the task offloading problem while accounting for the possibility of dispatching jobs to a remote cloud as well as to an MEC. In addition, they presented a preemptive scheduling scheme for the offloaded tasks with the objective of minimizing their weighted response time using an online algorithm.

C. Novelty of Our Work in Comparison to the Literature

While only a few works in the literature accounted for the deadline of the offloaded tasks and their demand to be processed by a specific type of IoT application, we target in this work the QoE requirements of the rising 5G services of different business verticals by presenting a complete offloading scheme that accounts for joint resource provisioning of IoT applications as well as fine-grained task scheduling to meet the delay-sensitive requirements of these 5G services. We show that such computing resource provisioning directly affects the scheduling decisions and impacts the number of tasks that can be admitted to the network. Further, another major contribution of this work is the solution methodology and the achieved scalability. Although the DTOS problem is very complex, to the best of our knowledge, we are the first to present an LBBD framework for this problem that is able to achieve several orders of magnitude faster run times while providing the optimal solution.

III. SYSTEM MODEL

The tremendous move towards smart cities is gaining momentum with the development of 5G. Smart cities will enable several services such as smart traffic management, security, energy efficiency, and smart health care, which make use of multiple IoT devices and applications. We consider a smart city Wide Area Network (WAN) as depicted in Fig. 1, composed of a set S of cellular base stations which can be either macro cells (eNBs) or small cells (SCeNBs). For simplicity and without loss of generality, we will represent by eNB any type of base station. In order to enable flexible routing and communications among eNBs, we consider that the core cellular network is enabled with Software Defined Networking (SDN) technology. SDN is a paradigm that simplifies network management by logically centralizing the control logic in one centralized entity called the SDN controller. Instructed by network applications about the desired network policy (e.g., a routing application decides on the path of a flow), the SDN controller installs the appropriate forwarding rules in all forwarding devices' routing tables, such as those in OpenFlow switches [22]. Thus, we consider an SDN-based cellular network enabled with SDN controllers and OpenFlow switches [16]. The SDN controllers benefit from a global view of the network and can be used to provide monitoring-based information such as the latency experienced by a flow between two eNBs [23]. A subset of the eNBs in S are mounted with MEC servers to provide computation offloading services to the IoT users (i.e., User Equipments (UEs) such as tablets and wearable devices) (Fig. 1). The MEC-enabled system that we consider operates in a time-slotted structure, where we denote by δ ∈ Δ a time slot.

Let M be the set of deployed MEC servers; each MEC server m ∈ M consists of a pool of physical servers with an aggregated computing capacity c_m specified in terms of cycles/second or MHz. The MEC servers host a set A of IoT applications of multiple types (e.g., face recognition, video encoding, etc.) designated to process the offloaded tasks of UEs. Each application a ∈ A is a software instance which can be deployed on top of a VM or a container hosted on an MEC server m ∈ M. It is of a specific type that we denote by t_a ∈ T, where T is the set of IoT application types. Like any other software, an application a ∈ A requires some minimum system


Fig. 1. System model.

specifications to be able to operate efficiently. For simplicity and without loss of generality, we represent the minimum system requirements of an application a by the minimum processing capacity p^a_min it requires to operate. However, each application can be provisioned with some computing resources p_a that exceed its minimum requirement p^a_min, in order to maximize the workload it can process within a time limit. The processing capacity p_a of an application is represented in terms of cycles/second and is related to the virtual CPU (vCPU) resources (number of cores) assigned to the VM/container on top of which the application is running [24]. Further, we assume that these applications can be shared by many UEs but can process the task of one UE at a time.

A. UEs Computation Tasks

We consider a set U of UEs requesting to offload their delay-sensitive tasks to be processed by an IoT application a ∈ A of a suitable type deployed on an edge server m ∈ M. In this work, we account for a quasi-static scenario where the set U of UEs remains unchanged during the offloading period;2 however, it may change across different periods. We consider that each UE u ∈ U has one computation task at a time; thus, in the following we use task and UE interchangeably. We represent each task by a tuple ⟨t_u, μ_u, θ_u⟩, where t_u ∈ T depicts the type of the IoT application required to process the task of UE u ∈ U, μ_u represents the workload (in cycles) required to accomplish the processing of the task of UE u and can be obtained by profiling the task execution [25], and θ_u denotes the latency requirement (i.e., deadline), in terms of time slots, of the task of UE u. Note that, if the task is not processed within its deadline, it will be rejected from the network.

2 We leave the study of the dynamic scenario, where mobile UEs arrive and depart dynamically during an offloading period, for future work.
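As an illustration (not part of the formal model), the tuple ⟨t_u, μ_u, θ_u⟩ can be encoded as a small Python data structure; the field names and example values below are chosen only for exposition.

from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    """Offloaded task of one UE, following the tuple <t_u, mu_u, theta_u>."""
    ue_id: str       # identifier of the UE that generated the task
    app_type: str    # t_u: type of IoT application required to process the task
    workload: float  # mu_u: workload in cycles, obtained by profiling the task execution
    deadline: int    # theta_u: latency requirement expressed in time slots

# Hypothetical example: a face-recognition task of 6e8 cycles that must finish within 20 slots.
t1 = Task(ue_id="UE1", app_type="face_recognition", workload=6e8, deadline=20)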

B. Experienced Delays

Processing the offloaded tasks with respect to their latency requirements entails deciding on the MEC server to which each of the tasks should be offloaded, determining the computation resources to allocate to the IoT applications that will process the tasks, and specifying the order in which the offloaded tasks should be processed by each of the applications. Solving the three aforementioned challenges highly impacts the admission of the tasks to the network, as they directly affect some of the delays the tasks experience. In the following, we summarize the delays that an offloaded task experiences.

1) Upload Delay d^u_up: The task uploading delay corresponds to the time required to transmit the task from the UE u to the serving eNB. We assume that the serving eNB s ∈ S of each UE u is the base station with the highest received signal quality. For simplicity and without loss of generality, we assume that d^u_up is predefined and can be calculated based on the Signal to Interference plus Noise Ratio (SINR), as explained in [6].

2) Edge-to-Edge Delay d^u_ee: Once the task of UE u is uploaded, it should be processed by an IoT application a ∈ A of a suitable type, deployed on the MEC server m ∈ M to which the task was offloaded. It is in the best interest of u to have its task processed by the MEC server attached to its serving eNB, to avoid any additional network delays. However, the serving eNB may not be enabled with MEC capabilities (UE1 in Fig. 1), or the MEC server attached to it may not be able to process the task of u within its deadline θ_u; that is, 1) the MEC server m is not hosting an application instance a of the type required by u (t_a ≠ t_u), or 2) the hosted application instance a of the type requested by u does not have enough processing capacity p_a to meet the task's deadline, or 3) a is overloaded and hence the task of UE u would experience a long waiting delay in its buffer before being processed, since other tasks were scheduled before it and a can process the task of one UE at a time. Thus, in any of these situations, the task of UE u can be offloaded to another MEC server m′ that is able to process it with respect to its QoE requirements. In this case, the serving eNB needs to transmit the task to another eNB s ∈ S where m′ is hosted. Hence, we denote by d^u_ee the delay incurred for transmitting the task of a UE u from its serving eNB to the eNB connected to the MEC server where the task of u will be processed. As our SDN-based cellular core network can be used to establish a routing path between two eNBs [14], the edge-to-edge delay d^u_ee can be measured by the SDN controllers enabled with monitoring tools such as SLAM [23]. Thus, we define a matrix H with elements h^m_u to represent the value of d^u_ee for each UE u ∈ U to each MEC server m ∈ M. Note that d^u_ee = h^m_u = 0 if the MEC server m is attached to the serving eNB of u.

3) Waiting Delay d^u_wait: When the task of UE u reaches the MEC server m hosting the IoT application a that can process it, it may experience some waiting delay, which we denote by d^u_wait, in the buffer of a. Such delay depends on the scheduling order and the size of the tasks assigned to a.

4) Processing Delay d^u_proc: Once the task of u starts processing on its assigned application a, it will experience some processing delay, which we denote by d^u_proc. d^u_proc is the time taken by a to execute the task of UE u and is inversely proportional to the computing resources allocated to a, as specified in Eq. (1), where μ_u and p_a are as defined above.

d^u_proc = μ_u / p_a    (1)

5) Download Delay d^u_down: Once the execution of the task of a UE u by IoT application a is finalized, the output should be transmitted back to u. As the size of the output is usually much smaller than the initial size of the task, we assume that the download delay incurred by transferring the output to u is negligible [6]. Thus, we consider that d^u_down = 0.

In the following, we address the joint problem of task offloading, application resource allocation, and task scheduling and refer to it as the Dynamic Task Offloading and Scheduling (DTOS) problem.
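As a quick illustration of the delay budget discussed above, the following sketch sums the delay components of a single task and checks its deadline; rounding the processing delay up to whole time slots and the parameter values are assumptions made only for this example.

import math

def processing_delay(workload: float, p_a: float) -> float:
    # d^u_proc = mu_u / p_a (Eq. (1)); here p_a is expressed in cycles per time slot.
    return workload / p_a

def meets_deadline(d_up: float, d_ee: float, d_wait: float,
                   workload: float, p_a: float, deadline: float) -> bool:
    # Total experienced delay: upload + edge-to-edge + waiting + processing; the download
    # delay is taken as zero, as assumed above. Returns True if the task meets its deadline theta_u.
    total = d_up + d_ee + d_wait + math.ceil(processing_delay(workload, p_a))
    return total <= deadline

# Illustrative numbers only: 2 slots upload, 1 slot edge-to-edge, 3 slots waiting,
# 60 cycles of work on an application provisioned with 8 cycles/slot, deadline of 15 slots.
print(meets_deadline(d_up=2, d_ee=1, d_wait=3, workload=60, p_a=8, deadline=15))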

IV. DYNAMIC TASK OFFLOADING AND SCHEDULING - A MIXED INTEGER PROGRAM (DTOS-MIP)

We define and mathematically formulate the Dynamic Task Offloading and Scheduling (DTOS) problem as a Mixed Integer Program (MIP).

A. Problem Definition

TABLE I: Parameters of the DTOS-MIP.

Let G(N, E) be a physical network consisting of a set of nodes N = R ∪ M ∪ S, where R is a set of physical equipment (e.g., switches, routers, etc.) and M is a set of MEC servers attached to a set S of eNBs; E is the set of links connecting them. Let A be the set of IoT applications of different types deployed on the MEC servers m ∈ M, and let U be the set of UEs requesting to offload and process their latency-sensitive tasks on these applications. The DTOS problem can be formally defined as follows:

Definition 1: Given a physical network G(N, E) and a set U of UEs, each UE requesting to offload and process a generated task on an IoT application of the same type deployed on one of the MEC servers m ∈ M; determine the optimal assignment of the tasks generated by the UEs to the set of applications a ∈ A, provision computing resources for each application a, and schedule the processing of the tasks assigned to each of them, in order to maximize the number of admitted tasks with respect to their latency requirements.

B. Problem Formulation

Table I delineates the parameters used in the formulation of the DTOS-MIP problem presented below. We define the variable y^{aδ}_u to indicate that the IoT application a ∈ A started processing the task of UE u ∈ U at time slot δ ∈ Δ:

y^{aδ}_u = 1 if the task of UE u ∈ U started its processing on IoT application a ∈ A at time slot δ ∈ Δ, and 0 otherwise.


Our objective is to maximize the number of admitted tasks (Eq. (2)). A task of a UE u ∈ U is admitted if it can be processed by an IoT application a ∈ A within its specified deadline θ_u.

Maximize ∑_{u∈U} ∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u    (2)

In order to meet our objective, several constraints, which we elucidate in the following, have to be respected. Towards defining these constraints, we declare p_a ∈ R+ as a decision variable that determines the processing capacity allocated to an IoT application a ∈ A. We introduce the variable n_a ∈ {0, 1} to depict whether an IoT application a ∈ A is used, that is, whether it is processing at least one task:

n_a = 1 if IoT application a ∈ A is used, and 0 otherwise.

Further, we declare s^a_{uu′} ∈ {0, 1} as a decision variable to indicate that the task of UE u started processing on IoT application a before the task of UE u′:

s^a_{uu′} = 1 if the task of UE u started processing on application a ∈ A before the task of UE u′, and 0 otherwise.

In order to maximize the number of admitted tasks, we first need to decide on the computing resources to allocate to the deployed IoT applications. Hence, we define Eq. (3) and Eq. (4) to specify that an IoT application a is used if at least one task is scheduled to be processed on it, and to ensure that it is not used otherwise.

n_a ≤ ∑_{u∈U} ∑_{δ∈Δ} y^{aδ}_u    ∀a ∈ A    (3)

H·n_a ≥ ∑_{u∈U} ∑_{δ∈Δ} y^{aδ}_u    ∀a ∈ A    (4)

Eq. (5) guarantees that a used IoT application a ∈ A is allocated at least the minimum computing resources p^a_min it requires to operate.

p_a ≥ n_a · p^a_min    ∀a ∈ A    (5)

Eq. (6) guarantees that the maximum computing capacity that can be assigned to an IoT application a ∈ A cannot exceed the capacity of the MEC server m ∈ M hosting it.

p_a ≤ n_a ∑_{m∈M} x^a_m c_m    ∀a ∈ A    (6)

Note that Eq. (5) and Eq. (6) ensure that an application a ∈ A will not be allocated any computing resources if it is not used. Eq. (7) guarantees that the capacity of an MEC server m ∈ M is not violated.

∑_{a∈A} x^a_m p_a ≤ c_m    ∀m ∈ M    (7)

A valid task offloading requires that the task of a UE u cannot be scheduled on more than one application a ∈ A (Eq. (8)).

∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u ≤ 1    ∀u ∈ U    (8)

Further, the task of a UE u ∈ U cannot be scheduled on an IoT application a ∈ A whose type t_a ∈ T differs from the requested one t_u ∈ T (Eq. (9)).

∑_{u∈U} ∑_{a∈A:(t_a ≠ t_u)} ∑_{δ∈Δ} y^{aδ}_u ≤ 0    (9)

In addition, Eq. (10) guarantees that an IoT application a ∈ A can process at most one task in a time slot.

∑_{u∈U} y^{aδ}_u ≤ 1    ∀δ ∈ Δ, ∀a ∈ A    (10)

As we assume non-preemptive scheduling, we present Eq. (11) and Eq. (12) to ensure that an IoT application a ∈ A processes a UE task u ∈ U completely before starting a new task. Thus, an IoT application a cannot start processing the task of UE u′ before finishing the processing of the task of a UE u if u is scheduled before u′ on a (Eq. (11)). Similarly, an IoT application a cannot start processing the task of a UE u before finishing the processing of the task of u′ if u′ is scheduled before u on a (Eq. (12)).

∑_{δ∈Δ} y^{aδ}_{u′} δ ≥ ∑_{δ∈Δ} y^{aδ}_u δ + d^u_proc − H(1 − s^a_{uu′})    ∀a ∈ A: (t_u = t_{u′} = t_a), ∀u, u′ ∈ U: (u ≠ u′)    (11)

∑_{δ∈Δ} y^{aδ}_u δ ≥ ∑_{δ∈Δ} y^{aδ}_{u′} δ + d^{u′}_proc − H(1 − s^a_{u′u})    ∀a ∈ A: (t_u = t_{u′} = t_a), ∀u, u′ ∈ U: (u ≠ u′)    (12)

d^u_proc in Eq. (11) and Eq. (12) determines the processing delay experienced by the task of UE u on the IoT application processing it. It is calculated as in Eq. (13).

d^u_proc = ∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u (μ_u / p_a)    ∀u ∈ U    (13)

Eq. (14) represents the precedence constraint of the schedule of the tasks of UEs u and u′ on IoT application a ∈ A.

s^a_{uu′} + s^a_{u′u} = (∑_{δ∈Δ} y^{aδ}_u)(∑_{δ∈Δ} y^{aδ}_{u′})    ∀a ∈ A: (t_u = t_{u′} = t_a), ∀u, u′ ∈ U: (u ≠ u′)    (14)

An IoT application a cannot start processing the task of a UE u unless the task has been uploaded and transmitted to the application (Eq. (15)),

∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u d^u_up + d^u_ee ≤ ∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u δ    ∀u ∈ U    (15)

where d^u_ee captures the edge-to-edge delay (Section III) and is determined as specified in Eq. (16).

d^u_ee = ∑_{m∈M} ∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u x^a_m h^m_u    ∀u ∈ U    (16)

Finally, since we are addressing tasks with stringent deadlines, we need to ensure that the total delay experienced by a task of UE u ∈ U does not exceed its deadline, as specified in Eq. (17), where d^u_proc is as defined in Eq. (13).

∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u δ + d^u_proc ≤ θ_u    ∀u ∈ U    (17)

Eqs. (11), (12), and (17) are nonlinear due to the nonlinearity of d^u_proc (Eq. (13)). Such nonlinearity is related to the division by the decision variable p_a, which is multiplied by another decision variable y^{aδ}_u. Hence, in order to linearize it, we reduce the search space by allowing p_a to take at most one specific value from a predefined set P instead of all the values in R+. This is enforced by Eq. (18),

∑_{p∈P} z^p_a ≤ 1    ∀a ∈ A    (18)

where z^p_a is a new decision variable defined as follows:

z^p_a = 1 if a is allocated the processing capacity p ∈ P, and 0 otherwise.

Thus, Eq. (13) can then be rewritten as in Eq. (19).

d^u_proc = ∑_{a∈A} ∑_{δ∈Δ} y^{aδ}_u ∑_{p∈P} z^p_a (μ_u / p)    ∀u ∈ U    (19)

Similarly, p_a can be replaced by ∑_{p∈P} z^p_a p in constraints (5), (6) and (7). Finally, Eqs. (11), (12), (14), (17) and (19) are nonlinear and can be easily linearized, but we omit the linearization details due to space limitations.
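As an illustration of one standard way to perform such a linearization (given here as a textbook sketch; the omitted derivation may differ in its details), the product y^{aδ}_u · z^p_a appearing in Eq. (19) can be replaced by an auxiliary binary w^{aδp}_u constrained as follows.

% Standard linearization of a product of two binary variables (illustrative sketch).
% Let w_{u}^{a\delta p} stand for the product y_{u}^{a\delta} \cdot z_{a}^{p}.
\begin{align}
  w_{u}^{a\delta p} &\le y_{u}^{a\delta}, \\
  w_{u}^{a\delta p} &\le z_{a}^{p}, \\
  w_{u}^{a\delta p} &\ge y_{u}^{a\delta} + z_{a}^{p} - 1, \\
  d^{u}_{proc} &= \sum_{a\in A}\sum_{\delta\in\Delta}\sum_{p\in P} w_{u}^{a\delta p}\,\frac{\mu_u}{p} \qquad \forall u \in U.
\end{align}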

C. DTOS Complexity

DTOS is a MIP (DTOS-MIP) which is complex to solve. It is NP-hard given that it is a combination of three NP-hard problems: the task offloading problem, the application resource allocation problem, and the non-preemptive task scheduling problem. In fact, the task offloading problem can be proven NP-hard via a reduction from the generalized assignment problem [26], while the application resource allocation problem can be proven NP-hard via a reduction from the two-dimensional bin-packing problem [27], where the MEC servers are the bins and the IoT applications are the objects to pack. Similarly, the task scheduling problem can be proven NP-hard via a reduction from the job-shop scheduling problem [28].

V. DTOS-LBBD: A LOGIC-BASED BENDERS DECOMPOSITION

Given the complexity of the DTOS problem, we devise in the following a Logic-Based Benders Decomposition (LBBD) technique to solve it (DTOS-LBBD).

A. LBBD in a Nutshell

LBBD [29] is a row generation technique which follows the "no-good learning" strategy. It consists of decomposing the problem into a Master Problem (MP), representing a relaxation of the original model, and one or more Sub-Problems (SPs). The SP, also known as the inference dual, is an optimization over the secondary variables while fixing the primary variables to values computed based on the solution of the MP. By assigning the primary variables some trial values and solving the SP, the LBBD learns about the quality of other trial solutions, which is then used to reduce the number of solutions that need to be enumerated in order to find the optimal one. More precisely, the solution of the SP is used to derive Benders' cuts which are added to the MP in order to cut infeasible solutions from its solution space. An infeasible solution, or non-globally-feasible solution, can be defined as follows:

Definition 2: An infeasible solution (or non-globally-feasible solution) is a solution provided by the MP that is not feasible for the SP (i.e., a solution that does not satisfy the SP constraints).

Benders' cuts are derived from the inference dual, which can be defined as the problem of inferring the tightest possible bound on the optimal value of the main problem [29]. LBBD consists of iteratively solving the MP and the SPs, deriving and adding Benders' cuts from the SPs to the MP, until the MP and SP solutions converge.

B. DTOS Decomposition Strategy

The efficiency of LBBD relies on the decomposition approach and the strength of the defined Benders' cuts. Unlike the classical Benders approach, where Benders' cuts can be easily defined based on Lagrangean multipliers obtained from the solution of the SP dual, no standard scheme for generating Benders' cuts exists for LBBD [30]. As classical Benders is inappropriate for the DTOS problem, given that it requires the SP to be a continuous linear or nonlinear program [30], designing an efficient LBBD with strong Benders' cuts for the DTOS problem is challenging.

Hence, our DTOS-LBBD divides the DTOS problem into an MP, which solves the task assignment and the application resource allocation problems, and multiple SPs, which solve the task scheduling problem. Jointly solving the task assignment and the application resource allocation problems has a positive impact on the efficiency of our DTOS-LBBD: 1) It overlooks the granularity of scheduling the tasks with respect to their latency requirements; instead, it considers assigning the tasks of UEs to IoT applications of suitable types while allocating each of these applications the minimum processing capacity required to meet the deadline of each of the assigned tasks. This provides an upper bound on the number of tasks that can be scheduled, as it represents a relaxation of the original problem. 2) Since application schedules are independent from each other, our DTOS-LBBD design allows us to benefit from a distributed scheduling scheme by devising a scheduling problem for each used IoT application. Defining an SP per IoT application may not be possible if the application resource provisioning were solved as part of the LBBD SP, due to the MEC server capacity constraint which controls the processing capacity allocated to its hosted applications. Finally, distributed scheduling enables the parallel execution of the SPs and hence reduces the overall computation time of the DTOS-LBBD.

Fig. 2. DTOS-LBBD flowchart.

TABLE II: Parameters of the MP.

As depicted in Fig. 2, the DTOS-LBBD starts by solving the MP. The MP solution yields an assignment of a subset of the tasks of UEs to the hosted IoT applications and a processing capacity allocated to each of these applications. For each used application a ∈ A, an SP is defined and fed with the set of tasks of UEs assigned by the MP to a and with the processing capacity p_a allocated to a. Let ψ^MP_a be the total number of tasks of UEs assigned to a by the MP. Further, let ψ^SP_a denote the maximum number of tasks of UEs that can be scheduled within their delay requirements on a by the SP. For every used application a ∈ A, if ψ^MP_a > ψ^SP_a, a Benders' cut is derived and added to the MP to guide it towards determining a better value for p_a and hence towards task assignments that are likely to be fully scheduled by the SP. The MP is solved again after adding the Benders' cuts inferred from all the SPs. This process is repeated until ∑_{a∈A} ψ^MP_a = ∑_{a∈A} ψ^SP_a, at which point the optimal solution is reached.
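The iterative flow of Fig. 2 can be summarized by the following structural sketch, where solve_master and solve_schedule_sp are placeholder names standing in for the MP and SP models of Sections V-C and V-D; the data layout of their results is assumed only for illustration.

def run_dtos_lbbd(instance, solve_master, solve_schedule_sp, max_iters=100):
    # Iterate MP/SP resolution until the admitted-task counts match (convergence criterion of Fig. 2).
    cuts = []                                          # accumulated Benders' cuts fed back to the MP
    for _ in range(max_iters):
        mp = solve_master(instance, cuts)              # task assignment and capacities p_a per application
        converged = True
        for app, assigned in mp["assignment"].items(): # tasks assigned by the MP to each used application
            sp = solve_schedule_sp(app, assigned, mp["capacity"][app])
            if len(sp["admitted"]) < len(assigned):    # psi_SP < psi_MP: derive cuts of the form of Eq. (39)
                converged = False
                for rejected_task in sp["rejected"]:
                    cuts.append({"app": app,
                                 "p_a": mp["capacity"][app],
                                 "admitted": list(sp["admitted"]),
                                 "rejected": rejected_task})
        if converged:
            return mp                                  # MP and SPs agree: optimal solution reached
    raise RuntimeError("no convergence within the iteration budget")

The SPs within one iteration are independent of each other, so in practice they can be dispatched in parallel, as noted above.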

C. The Master Problem (MP)

The MP-specific parameters are detailed in Table II, while the remaining ones are as specified in Table I. We define the decision variable q^a_u ∈ {0, 1} to determine whether the task of UE u ∈ U is mapped to IoT application a ∈ A:

q^a_u = 1 if the task of UE u ∈ U is mapped to IoT application a, and 0 otherwise.

The objective of the MP is to maximize the number of admitted tasks (Eq. (20)).

ψ^MP = Maximize ∑_{u∈U} ∑_{a∈A} q^a_u    (20)

The MP is subject to several constraints. We start by defining a decision variable n_a ∈ {0, 1} to specify that an application a ∈ A is used, that is, a task of at least one UE is assigned to it:

n_a = 1 if IoT application a ∈ A is used, and 0 otherwise.

We define p_a ∈ R+ as a decision variable that specifies the processing capacity allocated to an application a ∈ A.

MP basic constraints:

The set of constraints includes Eq. (21), which states that the task of UE u ∈ U can be processed by at most one IoT application a ∈ A.

∑_{a∈A} q^a_u ≤ 1    ∀u ∈ U    (21)

Further, Eq. (22) is formulated to prevent the tasks of UEs from being processed on IoT applications of a different type.

∑_{u∈U} ∑_{a∈A:(t_a ≠ t_u)} q^a_u = 0    (22)

Eq. (23) and Eq. (24) specify that an IoT application a ∈ A is used if at least one task is offloaded to be processed on it.

n_a ≤ ∑_{u∈U} q^a_u    ∀a ∈ A    (23)

H·n_a ≥ ∑_{u∈U} q^a_u    ∀a ∈ A    (24)

Eq. (25) guarantees that an IoT application a ∈ A, if used, is allocated at least its minimum required processing capacity p^a_min.

p_a ≥ n_a · p^a_min    ∀a ∈ A    (25)

Eq. (26) guarantees that the maximum computing resources that can be allocated to an IoT application a ∈ A cannot exceed those of the MEC server hosting it.

p_a ≤ n_a ∑_{m∈M} x^a_m c_m    ∀a ∈ A    (26)

Eq. (25) and Eq. (26) guarantee that no computing resources can be assigned to an unused IoT application a ∈ A. We define Eq. (27) to ensure that the capacity of each MEC server m ∈ M is respected.

∑_{a∈A} x^a_m p_a ≤ c_m    ∀m ∈ M    (27)

Strengthening the MP formulation: While the above MP formulation provides an upper bound on the optimal solution of the DTOS problem, our experiments have shown that it is not sufficient to efficiently solve large test instances. Therefore, we strengthen the MP formulation by adding valid inequalities to further tighten the solution space. Thus, we first add Eq. (28) to determine a lower bound on the processing resources to be allocated to each application a ∈ A based on its assigned tasks. The minimum processing resources needed to process all the tasks assigned to an application a can be determined based on Eq. (1) by considering the maximum processing time available to them on a, which can be calculated by deducting the earliest arrival time σ^a_min to a (i.e., σ^a_min = min σ^a_u ∀u ∈ U: (t_u = t_a), where σ^a_u accounts for the upload and edge-to-edge delays of the task of UE u to a) from the maximum deadline θ^a_max of all tasks of UEs requiring the same type as a.

p_a ≥ (∑_{u∈U:(t_u=t_a)} μ_u q^a_u) / (θ^a_max − σ^a_min)    ∀a ∈ A    (28)

Similarly, p_a can be lower bounded by the computing resources that, if allocated to a, allow the task of UE u to meet its deadline on a. Thus, we determine such a set of computing resources P^a_u by a pre-processing scheme: we first apply Eq. (29) to calculate the minimum processing resources p^{ua}_min required on a to meet the deadline of the task of UE u. P^a_u can then be determined by adding all the processing capacities p ∈ P that exceed p^{ua}_min and do not surpass the capacity of the MEC server hosting a, i.e., P^a_u = {p ∈ P : p^{ua}_min ≤ p ≤ ∑_{m∈M} x^a_m c_m}.

p^{ua}_min = μ_u / (θ_u − σ^a_u)    ∀u ∈ U, ∀a ∈ A: (t_u = t_a)    (29)

Hence, we define β^{ju}_a ∈ {0, 1} as a new decision variable to determine the processing capacity j ∈ P^a_u which is selected to process the task of UE u ∈ U on application a ∈ A:

β^{ju}_a = 1 if j ∈ P^a_u is used for processing the task of UE u on a, and 0 otherwise.

We add the inequality depicted in Eq. (30) as a constraint in the MP to specify that p_a should be greater than or equal to the maximum processing resources chosen to process any of the tasks of UEs u ∈ U assigned to it.

p_a ≥ ∑_{j∈P^a_u} j·β^{ju}_a    ∀u ∈ U, ∀a ∈ A    (30)

Eq. (31) is added to guarantee that a processing capacity is chosen for the task of UE u ∈ U on application a ∈ A if and only if it is mapped on a.

∑_{j∈P^a_u} β^{ju}_a = q^a_u    ∀u ∈ U, ∀a ∈ A    (31)

As some tasks may experience high edge-to-edge delays if they are assigned to an application a of the required type hosted on a certain MEC server m, the minimum processing capacity they require to meet their deadline on a may be very high and surpass the capacity c_m of MEC server m that can be provisioned to a (i.e., P^a_u = ∅). Thus, the assignment of such tasks to these applications will always lead to their rejection. Hence, we add Eq. (32) to prune such assignments.

∑_{u∈U:(P^a_u=∅)} ∑_{a∈A:(t_u=t_a)} q^a_u = 0    (32)

Note that, as in the DTOS-MIP, we replace p_a by ∑_{p∈P} z^p_a p in Eqs. (25), (26), (27), (28) and (30), where z^p_a ∈ {0, 1} is a decision variable as defined in Section IV. We add Eq. (18) to guarantee that at most one processing capacity is allocated to an application a ∈ A.
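The pre-processing behind Eq. (29) and the construction of P^a_u can be sketched as follows; the function and argument names are introduced only for illustration.

def candidate_capacities(mu_u: float, theta_u: float, sigma_au: float,
                         capacity_levels: list, host_capacity: float) -> list:
    # Build P^a_u: capacities p in P that meet the task deadline on application a,
    # i.e., p >= mu_u / (theta_u - sigma_au) (Eq. (29)), without exceeding the hosting MEC server.
    slack = theta_u - sigma_au
    if slack <= 0:                    # the task cannot even arrive before its deadline
        return []
    p_min = mu_u / slack              # p^{ua}_min from Eq. (29)
    return [p for p in capacity_levels if p_min <= p <= host_capacity]

# Illustrative values only: workload 60, deadline 15 slots, arrival at slot 3,
# capacity levels {2, 4, 6, 8, 12} cycles/slot on a server capped at 10 cycles/slot.
print(candidate_capacities(60, 15, 3, [2, 4, 6, 8, 12], 10))   # prints [6, 8]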

TABLE III: Parameters of the SP.

D. The Sub-Problem (SP)

Table III depicts the parameters of the SP model; the remaining parameters are as specified in Table I. The SP formulation is presented in the following. We define α_u ∈ {0, 1} as a decision variable which determines whether the task of UE u ∈ U_a is admitted on application a, that is, whether it can be scheduled within its latency requirement on a:

α_u = 1 if the task of UE u ∈ U_a is admitted on a, and 0 otherwise.

We define the decision variable y_u ∈ N+ to determine the time slot at which the task of UE u ∈ U_a starts processing on application a. Further, we declare s_{uu′} ∈ {0, 1} as a decision variable to indicate whether the task of UE u ∈ U_a started processing on application a before the task of a UE u′ ∈ U_a:

s_{uu′} = 1 if the task of UE u ∈ U_a started processing on application a before the task of UE u′ ∈ U_a, and 0 otherwise.

The objective of the SP is to maximize the number of admitted tasks (Eq. (33)).

ψ^SP_a = Maximize ∑_{u∈U_a} α_u    (33)

The SP is subject to several constraints. Eq. (34) guarantees that the task of UE u ∈ U_a cannot start processing on a before its arrival time, if it is admitted.

y_u ≥ σ^a_u α_u    ∀u ∈ U_a    (34)

Further, the application a should guarantee the consecutive processing of the task of UE u ∈ U_a during all its required processing time. Hence, Eq. (35) and Eq. (36) ensure that no two tasks can be scheduled on a at the same time.

y_u ≥ y_{u′} + d^{u′}_proc α_{u′} − H(1 − s_{u′u})    ∀u, u′ ∈ U_a: (u ≠ u′)    (35)

y_{u′} ≥ y_u + d^u_proc α_u − H(1 − s_{uu′})    ∀u, u′ ∈ U_a: (u ≠ u′)    (36)

where d^u_proc and d^{u′}_proc determine the processing delays of u and u′ (Eq. (1)). Eq. (37) represents the precedence constraint of the schedule of the tasks of UEs u and u′ on a.

s_{uu′} + s_{u′u} = α_u α_{u′}    ∀u, u′ ∈ U_a: (u ≠ u′)    (37)


Fig. 3. Benders’ cut insights.

Finally, Eq. (38) ensures that the total delay experienced by a task of UE u ∈ U_a does not exceed its latency requirement.

y_u + d^u_proc α_u ≤ θ_u    ∀u ∈ U_a    (38)

Note that Eq. (37) is nonlinear and can be easily linearized; however, we omit the linearization details due to space limitations.
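As an illustrative alternative encoding (not the formulation solved above), the single-application SP can also be expressed as a non-preemptive machine-scheduling model in a constraint-programming solver such as Google OR-Tools CP-SAT, using optional intervals and a no-overlap constraint with integer (whole-slot) processing times.

from ortools.sat.python import cp_model

def schedule_on_application(tasks, horizon):
    # tasks: list of dicts with 'arrival' (sigma^a_u), 'proc' (d^u_proc in slots), 'deadline' (theta_u).
    # Maximizes the number of tasks admitted on one application, mirroring the objective of Eq. (33).
    model = cp_model.CpModel()
    admitted, intervals = [], []
    for i, t in enumerate(tasks):
        adm = model.NewBoolVar(f"alpha_{i}")                      # alpha_u
        start = model.NewIntVar(t["arrival"], horizon, f"y_{i}")  # y_u >= sigma^a_u (Eq. (34))
        end = model.NewIntVar(0, horizon, f"end_{i}")
        iv = model.NewOptionalIntervalVar(start, t["proc"], end, adm, f"iv_{i}")
        model.Add(end <= t["deadline"]).OnlyEnforceIf(adm)        # deadline requirement (Eq. (38))
        admitted.append(adm)
        intervals.append(iv)
    model.AddNoOverlap(intervals)                                 # one task at a time (Eqs. (35)-(36))
    model.Maximize(sum(admitted))
    solver = cp_model.CpSolver()
    solver.Solve(model)
    return [i for i, adm in enumerate(admitted) if solver.Value(adm) == 1]

# Illustrative instance with three tasks whose processing times already reflect the allocated capacity p_a.
print(schedule_on_application(
    [{"arrival": 0, "proc": 8, "deadline": 9},
     {"arrival": 0, "proc": 5, "deadline": 6},
     {"arrival": 2, "proc": 3, "deadline": 20}], horizon=30))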

E. Benders’ Cut

When an SP fails to schedule all the tasks of UEs assigned to application a by the MP, a Benders' cut has to be generated and added to the MP to prune such a solution and similar infeasible ones. In fact, the failure of the SP to schedule all of the assigned tasks is a result of the MP allocating low computing resources p_a to application a. Hence, a Benders' cut can be added to guide the MP to either increase the value of p_a while keeping the same assignment of tasks, or to assign fewer tasks to the application to match those that were admitted by the SP. While such a cut is valid, we believe it is not strong enough as it only prunes the solution sent by the MP without considering any other similar infeasible ones.

In order to define a stronger Benders' cut, we try to identify the solutions that are likely to be provided by the MP and will be infeasible for the SP. Thus, we graphically depict an application a as a bin of height h = p_a and of width w = θ^a_max − σ^a_min, which indicates the time horizon during which the tasks of UEs u ∈ U_a have to be scheduled and processed on a (Fig. 3(b)). Each task u ∈ U_a can be seen as a rectangle of height h_u = j, where j is the processing capacity assigned to it by the MP (j ∈ P^a_u: β^{ju}_a = 1), and of width w_u = d^u_proc, representing the processing time of u on a when assigned the processing capacity j. The scheduling problem can then be abstracted as a bin-packing problem where a is the bin and the tasks are the objects to place in a. The geometrical size of a task u ∈ U_a grows in height if we increase j (i.e., more processing capacity) or in width if we decrease j (i.e., extend its completion time) (Eq. (1)).

To elaborate, we consider the example shown in Fig. 3, where we assume two tasks u1 and u2 to be scheduled on application a1 (Fig. 3(a)). a1 can be represented as a bin, and u1 and u2 are the objects to be placed in it (Fig. 3(b) and Fig. 3(c)). In Fig. 3(b), we assign task u2 computing resources j = 6 cycles/time slot, which yields a processing delay of 10 time slots, finishing at t11. Increasing the computing resources to j = 8 cycles/time slot for the processing of task u2 decreases its processing time to 8 time slots, finishing at t9 (Fig. 3(c)). However, in both cases it was not possible to admit u1 on a1 and satisfy its deadline.
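The slot counts in this example are consistent with Eq. (1) if the workload of u2 is on the order of 60 cycles and processing times are rounded up to whole slots; this workload value is assumed here only to verify the arithmetic, since the figure itself is not reproduced.

% Assumed workload \mu_{u2} = 60 cycles (illustrative value, not stated in the text).
\begin{align}
  d^{u2}_{proc}\big|_{j=6} &= \left\lceil \tfrac{60}{6} \right\rceil = 10 \text{ slots}, \\
  d^{u2}_{proc}\big|_{j=8} &= \left\lceil \tfrac{60}{8} \right\rceil = 8 \text{ slots}.
\end{align}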

Hence, with a processing capacity pa1 = 8 cylces/time slot assigned to application a1, at most one of bothtasks u1 or u2 can be admitted on a1 when assigned anycomputing resources j ≤ pa1 . More precisely, varying theprocessing capacity j assigned to u1 or u2 will result in thesame infeasible solution. Thus, we conclude that, if the setof tasks u ∈ Ua were not able to be scheduled by the SPwith pa ∈ P , then for sure they will not be admitted withany other value (p′a < pa) ∈ P . Therefore, we define a cut(Eq.(39)) to prune such infeasible solutions where pa is theprocessing capacity allocated to application a by the MP atthe previous iteration, Ua is the set of tasks of UEs that wereadmitted by the SP on a and Ua is the set of UEs whose taskswere rejected by the SP. Note that Eq.(39) guides the MPtowards assigning to application a at most the same numberof tasks |Ua| that were admitted by the SP from the set oftasks in (Ua ∪u′∈Ua

). Such guidance is only applicable in the

case where the tasks in (Ua∪u′∈Ua) were assigned computing

resources j ≤ pa. The cut (Eq.(39)) is added for every taskrejected by the SP. Fig.(3)(d) depicts the cut that needs to beadded for the example in Fig.(3).

$$\underbrace{\sum_{u \in \overline{U}_a} \; \sum_{j \in P_u^a : (j \le p_a)} \beta_{ua}^{j}}_{\text{admitted tasks by SP}} + \underbrace{\sum_{j \in P_{u'}^a : (j \le p_a)} \beta_{u'a}^{j}}_{\text{one rejected task by SP}} \le \underbrace{|\overline{U}_a|}_{\text{number of admitted tasks by SP}} \quad \forall u' \in \widehat{U}_a, \; \forall a \in A \qquad (39)$$
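For concreteness, a cut of the form of Eq. (39) is a plain linear inequality over the MP's assignment variables $\beta$ and can be appended to the master model as data. The sketch below is illustrative only: it assumes a generic PuLP model, a common capacity set shared by all tasks (rather than per-task sets $P_u^a$), and hypothetical containers for the SP outcome; it is not the authors' implementation.

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary

# Hypothetical MP skeleton: beta[(u, j, a)] = 1 iff task u is assigned to app a
# with processing capacity j (the paper's variable beta_{ua}^j).
tasks, apps, capacities = ["u1", "u2"], ["a1"], [2, 4, 6, 8]
mp = LpProblem("DTOS_MP", LpMaximize)
beta = LpVariable.dicts(
    "beta", [(u, j, a) for u in tasks for j in capacities for a in apps], cat=LpBinary
)

def add_benders_cut(model, beta, a, p_a, admitted, rejected, capacities):
    """Add one cut of the form of Eq. (39) per task rejected by the SP for app a."""
    feasible = [j for j in capacities if j <= p_a]  # capacities j <= p_a
    for u_rej in rejected:                          # one cut per rejected task u'
        model += (
            lpSum(beta[(u, j, a)] for u in admitted for j in feasible)
            + lpSum(beta[(u_rej, j, a)] for j in feasible)
            <= len(admitted)
        )

# Example matching Fig. 3: with p_a1 = 8 the SP admitted only u2, so u1 yields one cut.
add_benders_cut(mp, beta, "a1", p_a=8, admitted=["u2"], rejected=["u1"])
```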


To guarantee that the DTOS-LBBD converges to an optimal solution, we need to prove that Eq. (39) is a valid Benders' cut [31]. A Benders' cut is valid if it satisfies the following two conditions [31]:

Condition 1: The cut must exclude the current MP solution if it is not globally feasible.

Condition 2: The cut must not remove any globally feasible solution composed of any combination of tasks that were selected by the MP at a previous iteration and requiring a processing capacity $j \le p_a$.

Chu and Xia [31] show that Condition 1 guarantees finite convergence if the MP variables have finite domains, and that Condition 2 guarantees optimality since the cuts never remove feasible solutions.

Theorem 1: The Benders' cut in Eq. (39) is valid.

Proof: To prove the validity of our proposed cut, we need to show that Condition 1 and Condition 2 are satisfied.

We first prove that Condition 1 is satisfied. To show that Eq. (39) cuts off infeasible solutions provided by the MP, we will show that Eq. (39) is violated if the same set of tasks is admitted again by the MP on application a. Thus, we let $U_a^{(i)}$ be the set of tasks which were admitted by the MP on application a at iteration i, resulting in an MP solution which is globally infeasible (as found by the SP). Further, let $\overline{U}_a^{(i)}$ be the set of tasks which were admitted by the SP at iteration i ($\overline{U}_a^{(i)} \subset U_a^{(i)}$), and $\widehat{U}_a^{(i)} = U_a^{(i)} \setminus \overline{U}_a^{(i)}$ the set of tasks rejected by the SP. This will result in the cut

depicted in Eq. (40):

$$\sum_{u \in \overline{U}_a^{(i)}} \; \sum_{j \in P_u^a : (j \le p_a^{(i)})} \beta_{ua}^{j} + \sum_{j \in P_{u'}^a : (j \le p_a^{(i)})} \beta_{u'a}^{j} \le |\overline{U}_a^{(i)}| \quad \forall u' \in \widehat{U}_a^{(i)}, \; \forall a \in A \qquad (40)$$

If, at a subsequent iteration $k > i$, the same set of tasks $U_a^{(i)}$ is admitted by the MP, the left-hand side of Eq. (40) will be equal to $|\overline{U}_a^{(i)}| + 1$, as shown in Eq. (41), where $\beta_{ua}^{j(k)}$ denotes the MP solution at iteration k:

$$\underbrace{\sum_{u \in \overline{U}_a^{(i)}} \; \sum_{j \in P_u^a : (j \le p_a^{(i)})} \beta_{ua}^{j(k)}}_{|\overline{U}_a^{(i)}|} + \underbrace{\sum_{j \in P_{u'}^a : (j \le p_a^{(i)})} \beta_{u'a}^{j(k)}}_{1} = |\overline{U}_a^{(i)}| + 1 \quad \forall u' \in \widehat{U}_a^{(i)}, \; \forall a \in A \qquad (41)$$

The equality in Eq. (41) results from the following. Based on Eq. (31), a constraint of the MP that must hold for its provided solution $\beta_{ua}^{j(k)}$, we derive that $\sum_{j \in P_u^a} \beta_{ua}^{j(k)} = 1$. In addition, by accounting for the validity of Eq. (30), we obtain $\sum_{j \in P_u^a : (j \le p_a^{(i)})} \beta_{ua}^{j(k)} = 1$. Hence, $\sum_{u \in \overline{U}_a^{(i)}} \sum_{j \in P_u^a : (j \le p_a^{(i)})} \beta_{ua}^{j(k)} = |\overline{U}_a^{(i)}|$. Similarly, $\sum_{j \in P_{u'}^a : (j \le p_a^{(i)})} \beta_{u'a}^{j(k)} = 1$. This explains the equality in Eq. (41), which indeed shows that the MP solution violates the cut presented in Eq. (40). This proves that Condition 1 is satisfied.

Next, we prove that Condition 2 is satisfied. Since we need to show that the cut (Eq. (39)) does not remove any feasible solution, we provide a proof by contradiction: we consider a globally feasible solution W removed by the cut and show that such a solution cannot in fact be feasible, which is the contradiction. Hence, we first consider the infeasible solution I provided at iteration i, which resulted in the cut shown in Eq. (42), where not all the tasks assigned by the MP were admitted by the SP with a determined processing capacity $p_a^{(i)}$:

$$\sum_{u \in \overline{U}_a^{(i)}} \; \sum_{j \in P_u^a : (j \le p_a^{(i)})} \beta_{ua}^{j} + \sum_{j \in P_{u'}^a : (j \le p_a^{(i)})} \beta_{u'a}^{j} \le |\overline{U}_a^{(i)}| \quad \forall u' \in \widehat{U}_a^{(i)}, \; \forall a \in A \qquad (42)$$

The cut in Eq. (42) is designed to remove any infeasible solution composed of a subset of tasks from those in $U_a^{(i)}$ for any processing capacity less than or equal to $p_a^{(i)}$ assigned to application a, and it should not remove any feasible solution. To prove this by contradiction, we consider a globally feasible solution W found at iteration $w > i$ that was removed by the cut (Eq. (42)). That is, in W, the tasks in $\overline{U}_a^{(i)}$ admitted in I, in addition to one or more tasks in $\widehat{U}_a^{(i)}$ (that were rejected in I), are admitted with a processing capacity $p_a^{(w)} \le p_a^{(i)}$. Therefore, the negation of the cut in Eq. (42), presented in Eq. (43), holds for W:

$$\sum_{u \in \overline{U}_a^{(i)}} \; \sum_{j \in P_u^a : (j \le p_a^{(i)})} \beta_{ua}^{j(w)} + \sum_{j \in P_{u'}^a : (j \le p_a^{(i)})} \beta_{u'a}^{j(w)} > |\overline{U}_a^{(i)}| \qquad (43)$$

As the tasks in W are assigned a processing capacity $p_a^{(w)} \le p_a^{(i)}$, their processing times on application a will increase in comparison to those observed with $p_a^{(i)}$, and hence the total schedule length of all tasks will be greater than or equal to that obtained with $p_a^{(i)}$. Since solution W is feasible, any other solution in which the tasks experience less processing delay than observed with $p_a^{(w)}$ should also be feasible (i.e., tasks meet their deadlines). For the tasks to experience less processing delay, they need to be assigned a processing capacity higher than $p_a^{(w)}$, which is precisely the case of solution I, which is infeasible. Hence, W cannot be feasible, which completes the proof.

VI. PERFORMANCE EVALUATION

We carry out an extensive empirical study to evaluate the performance of our DTOS-LBBD approach against the DTOS-MIP. Further, we explore the engineering impact of the DTOS problem under varying system parameters and QoE requirements. We highlight the influence of the different problems solved (i.e., task offloading, application resource allocation, and task scheduling) on serving multiple vertical industries while analyzing the effectiveness of our proposed DTOS-LBBD framework.

A. Experimental Setup

In our numerical study, we consider networks of different sizes with a varying number of MEC servers, each having a capacity of $c_m = 20$ GHz [6]. We account for $|T| = 5$ different types of a varying number of IoT applications that belong to the same industry vertical (unless stated otherwise). Each IoT application requires minimum computing resources ($p_a^{min}$), randomly generated between 2 and 5 GHz.


TABLE IV

LATENCY REQUIREMENTS OF DIFFERENT INDUSTRY VERTICALS [32], [33]

TABLE V

DTOS-MIP VS. DTOS-LBBD

The IoT applications are randomly placed on the MEC servers. We assume multiple UEs offloading tasks that belong to different industry verticals and hence have varying QoE requirements. Thus, we depict in Table IV the different industry verticals accounted for in our tests and present the range of their latency requirements in addition to the values used in our tests. We consider that the number of cycles ($\mu_u$) demanded by the UEs is randomly generated between 20 and 100 cycles. The upload and edge-to-edge delays of the offloaded tasks are randomly generated between 1 and 2 ms and between 1 and 3 ms [34], respectively. All our numerical evaluations are averaged over 5 sets. They are conducted using CPLEX version 12.4 to solve the MIPs on an Intel Core i7-4790 CPU at 3.60 GHz with 16 GB of RAM.
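A random instance matching the setup above could be drawn as follows. This is a minimal sketch: the field names and the instance structure are assumptions chosen for illustration, while the numeric ranges follow the values reported in this section (the deadline range corresponds to the one used in Section VI-B).

```python
import random

def generate_instance(num_mec=3, num_apps=15, num_ues=30, seed=0):
    """Draw one random DTOS instance following the ranges of the experimental setup."""
    rng = random.Random(seed)
    mec_servers = [{"id": m, "capacity_ghz": 20} for m in range(num_mec)]   # c_m = 20 GHz
    apps = [{
        "id": a,
        "type": rng.randrange(5),              # |T| = 5 application types
        "p_min_ghz": rng.uniform(2, 5),        # p_a^min in [2, 5] GHz
        "host": rng.randrange(num_mec),        # random placement on MEC servers
    } for a in range(num_apps)]
    tasks = [{
        "id": u,
        "cycles": rng.randint(20, 100),        # mu_u in [20, 100] cycles
        "upload_delay_ms": rng.uniform(1, 2),  # [1, 2] ms
        "edge_delay_ms": rng.uniform(1, 3),    # [1, 3] ms edge-to-edge
        "deadline_ms": rng.uniform(5, 20),     # [5, 20] ms (Section VI-B)
    } for u in range(num_ues)]
    return mec_servers, apps, tasks
```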

B. DTOS-MIP vs. DTOS-LBBD

We start by evaluating the performance of DTOS-LBBD against the DTOS-MIP in terms of execution time as we vary the number of UEs' offloaded tasks. Increasing the number of offloaded tasks makes the problem harder to solve given the limited computing resources. Hence, we also examine the impact of such an increase on the admission rate. We consider a network composed of $|M| = 3$ MEC servers and $|A| = 15$ IoT applications of $|T| = 5$ different types representing multiple industry verticals. The deadlines of the offloaded tasks are randomly generated between 5 and 20 ms. Our results are presented in Table V.

1) Admission Rate: LBBD is an exact method that is able to provide the optimal solution, as shown in Table V, where the admission rates of the DTOS-MIP and DTOS-LBBD are equal. The same table shows that as the number of UEs increases, the admission rate decreases. Such a decrease is expected, as more tasks contend for the same amount of computing resources; hence, some of them suffer from high waiting delays on some IoT applications while waiting for those resources to be freed. This negatively impacts their latency requirements, causing them to miss their deadlines and be rejected from the network.

Fig. 4. DTOS-LBBD convergence.

2) Execution Time: We evaluate the scalability of the DTOS-LBBD against the DTOS-MIP. Our results, shown in Table V, clearly indicate that the DTOS-LBBD is much more scalable than the DTOS-MIP. In fact, it is able to provide the optimal solution on average 95% faster than the DTOS-MIP. This is because the LBBD learns from the quality of the solution generated at each iteration to cut off similar infeasible solutions from the solution space. This restricts the search space as the number of iterations increases and hence helps reach the optimal solution faster than the DTOS-MIP. In addition, the decomposition of the problem into multiple sub-problems helps reduce the execution time of DTOS-LBBD, especially since multiple scheduling SPs are run in parallel using threads.
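The per-application scheduling subproblems are independent once the MP has fixed the task-to-application assignment, so they can be dispatched concurrently. The sketch below illustrates this with a thread pool; `solve_scheduling_sp` is a hypothetical stand-in for the scheduling model solved by each SP, not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_all_subproblems(assignment_by_app, solve_scheduling_sp):
    """Run one scheduling SP per used application in parallel.

    assignment_by_app maps each application to the tasks the MP assigned to it;
    the result maps each application to its (admitted, rejected) task sets.
    """
    with ThreadPoolExecutor() as pool:
        futures = {a: pool.submit(solve_scheduling_sp, a, tasks)
                   for a, tasks in assignment_by_app.items()}
        return {a: f.result() for a, f in futures.items()}
```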

C. Evaluation of DTOS-LBBD

We evaluate the performance of DTOS-LBBD under different system parameters while studying the engineering impact of the DTOS problem.

1) DTOS-LBBD Convergence: In order to evaluate the performance of DTOS-LBBD, we account for a single test instance and plot in Fig. 4 the number of admitted tasks at each iteration as determined by the MP and the SPs. We consider a network of $|M| = 10$ MEC servers hosting $|A| = 15$ IoT applications. We account for $|U| = 30$ UEs' tasks belonging to the factory automation industry vertical ($\theta_u = 10$ ms). Fig. 4 shows that the objective of the MP represents an upper bound on the optimal objective value, while the number of tasks admitted by the SPs represents a lower bound. As the number of iterations increases, the objective value of the MP decreases, since more Benders' cuts are added to it. In contrast, the number of tasks admitted by the SPs varies between iterations depending on the requirements of the tasks (i.e., number of cycles, arrival time) sent to them by the MP. However, it is important to note that the optimal objective value always lies between the maximum lower bound and the minimum upper bound attained so far. Further, tracking the gap between the upper and lower bounds provides the option to terminate the DTOS-LBBD at any time based on the desired solution quality and runtime. For instance, one may terminate the DTOS-LBBD at iteration 14 with a gap of 9% between the upper and lower bounds, sacrificing little in solution quality while


TABLE VI

DTOS-LBBD EXECUTION TIME (MS)

gaining about 75.4% in terms of runtime. If a better solution quality is desired, one can stop the DTOS-LBBD at iteration 26, where the gap reaches 4.5%; in that case, the gain in terms of computation time is about 53%.

2) Trade-Off Between Optimality Gap and Runtime: To further emphasize that the LBBD approach is an anytime algorithm that can be stopped at any iteration while still providing a feasible solution, we show in Table VI the averaged runtime of the DTOS-LBBD using the same network settings mentioned in the previous paragraph. The results reported in Table VI depict the runtime of the DTOS-LBBD at the optimal solution and at the first occurrence of an optimality gap that is either less than 10% or between 10% and 20%. It is clear that, for a given number of UEs, the runtime increases as the optimality gap decreases. In fact, the runtime of DTOS-LBBD increases with the number of iterations: more Benders' cuts are added to the MP, tightening its solution space and better locating the optimal solution, which tends to decrease the gap between the upper bound provided by the MP and the lower bound given by the SPs. Thus, stopping the DTOS-LBBD at a certain tolerable gap can lead to large savings in computation time. For instance, when $|U| = 30$, a 97.26% gain in runtime is observed when the gap is between 10% and 20%, while 89.65% is obtained with a gap of less than 10%. Finally, it is worth noting that as the number of UEs increases, the runtime of the DTOS-LBBD increases as the size of the problem grows, and hence the problem becomes harder to solve.
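Putting the pieces together, the anytime behavior described above amounts to tracking the best lower and upper bounds across iterations and stopping once their relative gap falls below a tolerance. The sketch below is only an outline under the assumption of hypothetical `solve_master`, `solve_subproblems`, and `add_cuts` callbacks standing in for the MP, the parallel SPs, and the cut generation of Eq. (39).

```python
def lbbd_loop(solve_master, solve_subproblems, add_cuts, gap_tol=0.10, max_iters=100):
    """Logic-based Benders loop: the MP yields an upper bound, the SPs a lower bound;
    stop at proven optimality or once a tolerable optimality gap is reached."""
    best_lb, best_ub, incumbent = 0, float("inf"), None
    for _ in range(max_iters):
        mp_solution, ub = solve_master()                # offloading + resource allocation
        best_ub = min(best_ub, ub)
        sp_result, lb = solve_subproblems(mp_solution)  # per-application scheduling (parallel)
        if lb > best_lb:
            best_lb, incumbent = lb, sp_result
        if best_lb >= best_ub:
            break                                       # bounds meet: solution proven optimal
        if best_ub > 0 and (best_ub - best_lb) / best_ub <= gap_tol:
            break                                       # anytime exit with a bounded gap
        add_cuts(mp_solution, sp_result)                # one Eq. (39) cut per rejected task
    return incumbent, best_lb, best_ub
```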

3) Impact of Varying the Number of UEs: We vary the number of UEs and evaluate its impact for different industry verticals. We consider a network of $|M| = 10$ MEC servers hosting $|A| = 15$ IoT applications. Our results, presented in Fig. 5, show that as the number of UEs increases, the admission rate decreases for each vertical industry, as more tasks contend for the same computing resources (IoT applications), which become overloaded and hence fail to meet the delay requirements of all the UEs requesting their service. In fact, some tasks suffer from extra waiting delays, which lead them to miss their deadlines and thus be rejected from the network. However, such waiting delays can be tolerated if the latency requirements are relaxed. For instance, one can note the tactile Internet vertical, where the number of UEs increased from $|U| = 20$ to $|U| = 40$ while the admission rate decreased by 23%, as the limited computing resources failed to cope with such an increase. In contrast, for $|U| = 20$, the admission rate increased to 100% for less latency-sensitive tasks such as those belonging to the Tele-surgery industry. Further, as the Intelligent Transportation Systems (ITS) and Tele-surgery vertical industries have relatively relaxed delay requirements, the admission rate of UEs requiring such types of services was not affected by the increase in the number of tasks and remained constant at 100%.

4) Impact of Varying the Number of MEC Servers: We study in Fig. 5 the impact of increasing the computing resources on the admission rate for different vertical industries. Hence, we consider a network with a varying number of MEC servers hosting $|A| = 15$ IoT applications. We account for $|U| = 30$ UEs offloading tasks with varying latency requirements. Fig. 5 shows that adding more MEC servers to the network increases the amount of available computing resources. This allows the hosted IoT applications to be provisioned with more processing capacity, which reduces the processing time of the assigned tasks. Thus, as tasks are processed faster by the applications, others waiting for the same resources to be freed experience less waiting delay, and hence their chances of meeting their deadlines and being admitted to the network increase. In addition, one can note that, for a fixed number of MEC servers, the admission rate increases as the latency requirements are relaxed, since less-sensitive tasks can tolerate more waiting delay on the shared IoT applications. For instance, with $|M| = 3$ MEC servers, the admission rate increased by 73.34% as the deadline of the tasks increased from 7 ms for tactile Internet to 110 ms for Tele-surgery. Further, the 100% admission rate observed for the ITS and Tele-surgery vertical industries across the varying number of MEC servers indicates that $|M| = 3$ MEC servers were enough to admit all the assigned tasks given their relaxed latency demands.

5) Impact of Varying the Edge-to-Edge Delay: To explore the impact of the edge-to-edge delay on the admission rate, we consider a network of $|M| = 5$ MEC servers hosting $|A| = 15$ IoT applications. We account for $|U| = 25$ UEs belonging to the smart grid industry vertical ($\theta_u = 20$ ms) and fix the edge-to-edge delay of all the tasks to a defined value. Our results, presented in Fig. 6, show that increasing the edge-to-edge delay becomes prohibitive for the admission of tasks. In fact, it is in the best interest of each task to be processed on the MEC server attached to its serving eNB in order to avoid the edge-to-edge delay. However, as the number of IoT applications in the network is fixed, some MEC servers may not host certain application types; further, some of their deployed applications may be overloaded. This forces the tasks served by the eNBs attached to those MEC servers to travel through the network to be processed on an IoT application hosted on another MEC server, which makes them suffer from high edge-to-edge delay. Given their latency-sensitive requirements, these tasks are left


Fig. 5. Admission rate of DTOS-LBBD for different verticals. (a) Admission rate for varying number of UEs. (b) Admission rate for varying number of MEC servers.

Fig. 6. Admission rate over varying edge-to-edge delay.

with very little time for processing, which the IoT application to which they are assigned might fail to meet, hence leading to their rejection from the network.

VII. CONCLUSION

In this paper, we motivated and studied the DTOS problem, which jointly addresses the task offloading, application resource allocation, and task scheduling problems in an MEC network. We leverage the ability of virtualization technologies to automatically modify the computing resources allocated to applications by being the first to study the task scheduling problem under an undetermined computing resource allocation. Given the complexity of DTOS, we presented a novel decomposition strategy implementing the LBBD technique. Our DTOS-LBBD method decomposes the problem into an MP, which solves the task offloading and application resource allocation problems, and multiple SPs, each addressing the scheduling of tasks on a single used IoT application. DTOS-LBBD is an exact anytime method that can be terminated at any iteration, hence realizing a trade-off between solution quality and computation time. Through extensive simulations, we show that the DTOS-LBBD can achieve more than a 140-fold improvement in runtime compared to the DTOS-MIP and can serve as a benchmark algorithm to compare against other methods. Further, we explored the interdependence and implications of the aforementioned DTOS sub-problems for different vertical industries with variable latency requirements. While mainly targeting low-latency services, we evaluated the trade-offs


existing between the number of MEC servers, the number of tasks, and the latency requirements.

REFERENCES

[1] T. Taleb, K. Samdanis, B. Mada, H. Flinck, S. Dutta, and D. Sabella, "On multi-access edge computing: A survey of the emerging 5G network edge cloud architecture and orchestration," IEEE Commun. Surveys Tuts., vol. 19, no. 3, pp. 1657–1681, 3rd Quart., 2017.
[2] P. Mach and Z. Becvar. (2017). "Mobile edge computing: A survey on architecture and computation offloading." [Online]. Available: https://arxiv.org/abs/1702.05309
[3] Experts from Members of the 5G Infrastructure Association, "5G vision-the 5G infrastructure public private partnership: The next generation of communication networks and services," European Commission and 5G Infrastructure Association, White Paper, Feb. 2015. [Online]. Available: https://espas.secure.europarl.europa.eu/orbis/sites/default/files/generated/document/en/5G-Vision-Brochure-v1.pdf
[4] D. Evans, "The Internet of Things: How the next evolution of the Internet is changing everything," CISCO, San Jose, CA, USA, White Paper, Jan. 2011, vol. 1, pp. 1–11. [Online]. Available: https://www.cisco.com/c/dam/en_us/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf
[5] New Capabilities With Distributed Cloud, Ericsson, Stockholm, Sweden. [Online]. Available: https://www.ericsson.com/en/digital-services/trending/distributed-cloud
[6] T. X. Tran and D. Pompili. (2017). "Joint task offloading and resource allocation for multi-server mobile-edge computing networks." [Online]. Available: https://arxiv.org/abs/1705.00704
[7] G. Premsankar, M. Di Francesco, and T. Taleb, "Edge computing for the Internet of Things: A case study," IEEE Internet Things J., vol. 5, no. 2, pp. 1275–1284, Apr. 2018.
[8] H. Tan, Z. Han, X.-Y. Li, and F. C. M. Lau, "Online job dispatching and scheduling in edge-clouds," in Proc. IEEE Conf. Comput. Commun. (INFOCOM), May 2017, pp. 1–9.
[9] X. Lyu et al., "Optimal schedule of mobile edge computing for Internet of Things using partial information," IEEE J. Sel. Areas Commun., vol. 35, no. 11, pp. 2606–2615, Nov. 2017.
[10] A. Reznik et al., "Developing software for multi-access edge computing," Eur. Telecommun. Standards Inst. (ETSI), Europe, ETSI White Paper 20, 2017.
[11] H. Tanaka, M. Yoshida, K. Mori, and N. Takahashi, "Multi-access edge computing: A survey," J. Inf. Process., vol. 26, pp. 87–97, Feb. 2018.
[12] C. Mouradian, D. Naboulsi, S. Yangui, R. H. Glitho, M. J. Morrow, and P. A. Polakos, "A comprehensive survey on fog computing: State-of-the-art and research challenges," IEEE Commun. Surveys Tuts., vol. 20, no. 1, pp. 416–464, 1st Quart., 2017.
[13] L. Wang, L. Jiao, J. Li, and M. Mühlhäuser, "Online resource allocation for arbitrary user mobility in distributed edge clouds," in Proc. IEEE 37th Int. Conf. Distrib. Comput. Syst. (ICDCS), Jul. 2017, pp. 1281–1290.
[14] X. Sun and N. Ansari, "Latency aware workload offloading in the cloudlet network," IEEE Commun. Lett., vol. 21, no. 7, pp. 1481–1484, Jul. 2017.
[15] M. Jia, W. Liang, and Z. Xu, "QoS-aware task offloading in distributed cloudlets with virtual network function services," in Proc. 20th ACM Int. Conf. Modelling, Anal. Simulation Wireless Mobile Syst., 2017, pp. 109–116.
[16] Q. Fan and N. Ansari, "Application aware workload allocation for edge computing-based IoT," IEEE Internet Things J., vol. 5, no. 3, pp. 2146–2153, Jun. 2018.
[17] K. Katsalis, T. G. Papaioannou, N. Nikaein, and L. Tassiulas, "SLA-driven VM scheduling in mobile edge computing," in Proc. IEEE 9th Int. Conf. Cloud Comput. (CLOUD), Jun./Jul. 2016, pp. 750–757.
[18] Y. Mao, J. Zhang, and K. B. Letaief, "Joint task offloading scheduling and transmit power allocation for mobile-edge computing systems," in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Mar. 2017, pp. 1–6.
[19] L. Wang, L. Jiao, D. Kliazovich, and P. Bouvry, "Reconciling task assignment and scheduling in mobile edge clouds," in Proc. IEEE 24th Int. Conf. Netw. Protocols (ICNP), Nov. 2016, pp. 1–6.
[20] I. Farris, T. Taleb, H. Flinck, and A. Iera, "Providing ultra-short latency to user-centric 5G applications at the mobile network edge," Trans. Emerg. Telecommun. Technol., vol. 29, no. 4, p. e3169, 2018.
[21] P. Porambage, J. Okwuibe, M. Liyanage, M. Ylianttila, and T. Taleb. (2018). "Survey on multi-access edge computing for Internet of Things realization." [Online]. Available: https://arxiv.org/abs/1805.06695
[22] D. Kreutz, F. Ramos, P. E. Veríssimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig, "Software-defined networking: A comprehensive survey," Proc. IEEE, vol. 103, no. 1, pp. 14–76, Jan. 2015.
[23] C. Yu, C. Lumezanu, A. Sharma, Q. Xu, G. Jiang, and H. V. Madhyastha, "Software-defined latency monitoring in data center networks," in Proc. Int. Conf. Passive Act. Netw. Meas., Berlin, Germany: Springer, 2015, pp. 360–372.
[24] Azure Windows VM Sizes—Compute Optimized, Microsoft, USA, 2018. [Online]. Available: https://docs.microsoft.com/fi-fi/azure/virtual-machines/windows/sizes-compute
[25] L. Yang, J. Cao, H. Cheng, and Y. Ji, "Multi-user computation partitioning for latency sensitive mobile cloud applications," IEEE Trans. Comput., vol. 64, no. 8, pp. 2253–2266, Aug. 2015.
[26] M. Yagiura and T. Ibaraki, "The generalized assignment problem and its generalizations," St. Mary's College of Maryland, St. Mary's City, MD, USA, Tech. Rep. [Online]. Available: http://faculty.smcm.edu/acjamieson/f12/GAP.pdf
[27] D. Pisinger and M. Sigurd, "Using decomposition techniques and constraint programming for solving the two-dimensional bin-packing problem," INFORMS J. Comput., vol. 19, no. 1, pp. 36–51, 2007.
[28] Y. N. Sotskov and N. V. Shakhlevich, "NP-hardness of shop-scheduling problems with three jobs," Discrete Appl. Math., vol. 59, no. 3, pp. 237–266, 1995.
[29] J. N. Hooker and G. Ottosson, "Logic-based Benders decomposition," Math. Program., vol. 96, no. 1, pp. 33–60, 2003.
[30] J. N. Hooker, "Planning and scheduling by logic-based Benders decomposition," Oper. Res., vol. 55, no. 3, pp. 588–602, 2007.
[31] Y. Chu and Q. Xia, "Generating Benders cuts for a general class of integer programming problems," in Proc. Int. Conf. Integr. Artif. Intell. (AI) Oper. Res. (OR) Techn. Constraint Program., Berlin, Germany: Springer, 2004, pp. 127–141.
[32] M. A. Lema et al., "Business case and technology analysis for 5G low latency applications," IEEE Access, vol. 5, pp. 5917–5935, 2017.
[33] P. Schulz et al., "Latency critical IoT applications in 5G: Perspective on the design of radio interface and network architecture," IEEE Commun. Mag., vol. 55, no. 2, pp. 70–78, Feb. 2017.
[34] R. Yu, G. Xue, and X. Zhang, "Application provisioning in FOG computing-enabled Internet-of-Things: A network perspective," in Proc. IEEE Conf. Comput. Commun. (INFOCOM), Apr. 2018, pp. 783–791.

Authors’ photographs and biographies not available at the time of publication.