04.03 imprecise task schedule optimization [i].pdf
8/10/2019 04.03 Imprecise Task Schedule Optimization [I].pdf
http://slidepdf.com/reader/full/0403-imprecise-task-schedule-optimization-ipdf 1/11
1 Introduction
One of the most important issues in high-level synthesis is to obtain a good schedule so as to reduce the
total computation time of an application when the system is implemented. In order to produce such a
good schedule, the exact knowledge of execution time of each task is normally provided; however, from
time to time, these values are uncertain. Most previous work assumed that each task is associated with
fixed computation time, e.g., either worst or average case is considered. Since the knowledge about
computation time of a task could be uncertain, these assumptions may mislead and result in producing
a schedule that gives a long execution time after the actual system is built [11]. Assuming the execution
time of a task is a random variable with probability distributions seems to be an interesting approach.
Existing scheduling algorithms could, therefore, be extended to compute the resulting schedule with
probabilistic information. Nevertheless, collecting the probability data is costly and time consuming.
Furthermore, an algorithm to handle probability values is more complex than manipulating imprecise
data in fuzzy arithmetic [7]. In other words, using fuzzy numbers and fuzzy arithmetic, which includes
max-min operations, would be a good choice to avoid such complications. In this paper, a task is
associated with a computation time represented by a triangular fuzzy number. Fuzzy arithmetic is
applied to compute a possible schedule length so that the best schedule can be decided. This notion
is, then, incorporated in an efficient scheduling optimization algorithm called rotation scheduling in
which fuzzy arithmetic helps select a schedule position for each task.
Many researchers have applied the fuzzy logic approach to various kinds of scheduling problems. In
compiler optimization, fuzzy set theory has been used to represent an unpredictable real-time event and
imprecise knowledge about variables [8]. Lee et al. applied the fuzzy inference technique to find
a feasible real-time schedule where each task satisfies its deadline under resource constraints [15]. In
the production management area, fuzzy rules were applied to job shop and shop floor scheduling [17,21].
Kaviani and Vranesic used fuzzy rules to determine the appropriate number of processors for a given
set of tasks and deadlines for real-time systems [13]. Likewise, fuzzy calculus was applied to real-time
scheduling in [19,20]. Soma et al. considered schedule optimization based on a fuzzy inference
engine [18]. Their approach, however, does not take into account the fact that the execution time of each
job can be imprecise.
In data-flow analysis, an iterative code segment can be modeled as a cyclic data-flow graph (DFG)
or task graph, where nodes represent computation tasks and edges represent dependencies
(precedence relations) between such tasks. Each node is weighted according to its computation time,
i.e., the total time required to execute a task. The dependency distances or delays, between tasks in
different iterations, can be represented by bar lines on the edges. Considerable research has been
conducted in the area of scheduling nodes from directed-acyclic graphs (DAGs), a simpler version
of the cyclic data flow graph. Many heuristics have been proposed, e.g., list scheduling, and graph
or equal to zero.
Figure 1(a) illustrates the task graph , representing a set of tasks represented by vertices
and . The set of edges contains , , , , , and
. The computation time of each node is assumed to be crisp-valued. Each node in this graph takes 1
time unit except node B, which takes two time units to execute. The delays of the edges are ,
and , for . The direct edge from to , conveys that must be executed before
since may need data produced by at the current iteration and so on. An edge from to with
two bar lines (delays) conveys that requires data produced by in the two previous iterations (see
Figure 1(b)). This graph represents only one iteration of the execution pattern. This pattern is repeated
for n iterations as shown in Figure 1(b). Figure 1(c) shows a possible legal static schedule for this
graph where the target system is assumed to consist of two homogeneous processing elements. In this
schedule, the total execution time of this graph is 4 time units. Since B takes 2 time units to finish,
D cannot start the execution until time step 4.
[Figure 1 panels: (a) Task graph with nodes A, B, C, D; (b) Repeated execution over n iterations; (c) Static schedule]
(c) Static schedule:
Time PE1 PE2
1 A
2 B C
3 B
4 D
Figure 1: Task graph, repeated execution pattern, and its schedule
The computation time of each node in the graph is modeled by a fuzzy number. We assume that
a fuzzy number is normal and triangular, as presented in Figure 2(a). Suppose the computation
time of node A is assigned as shown in this figure with the confidence interval [2, 6]. The most
possible computation time of node A is 4 since its confidence level or presumption level is 1. Similarly,
Figure 2(b) shows the fuzzy number with the confidence interval [3, 7] representing the execution time
of B.
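[Figure 2 panels: (a) membership function µ(x) of A, with ticks at 2, 4, 6; (b) membership function µ(y) of B, with ticks at 3, 5, 7; (c) membership function µ(z) of A+B]
Figure 2: Two fuzzy numbers representing computation time of nodes A and B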
In order to calculate the addition of two fuzzy numbers, the following equation is used [7]:
\mu_{A+B}(z) = \bigvee_{z = x + y} \left( \mu_A(x) \wedge \mu_B(y) \right)    (1)
where \vee and \wedge denote the max and min operations respectively. A similar idea can also be applied to some
other operations such as subtraction, multiplication, maximum, minimum, etc. Figure 2(c) demonstrates
a graphical result of adding two fuzzy numbers from Figures 2(a)–2(b).
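As a rough illustration of the fuzzy addition described above, the following Python sketch adds two triangular fuzzy numbers by adding their endpoints and peaks, and cross-checks the result against a sampled max-min evaluation. The class name `TriFuzzy`, the helper `sup_min_add`, and the sampling resolution are assumptions made for this example; the values (2, 4, 6) and (3, 5, 7) are the ones read from Figure 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriFuzzy:
    """Triangular fuzzy number (lo, peak, hi); membership is 1 at peak."""
    lo: float
    peak: float
    hi: float

    def mu(self, x: float) -> float:
        """Membership degree of the crisp value x."""
        if x == self.peak:
            return 1.0
        if self.lo < x < self.peak:
            return (x - self.lo) / (self.peak - self.lo)
        if self.peak < x < self.hi:
            return (self.hi - x) / (self.hi - self.peak)
        return 0.0

    def __add__(self, other: "TriFuzzy") -> "TriFuzzy":
        # For triangular numbers the max-min sum is again triangular,
        # so it is enough to add the three defining points.
        return TriFuzzy(self.lo + other.lo,
                        self.peak + other.peak,
                        self.hi + other.hi)


def sup_min_add(a: TriFuzzy, b: TriFuzzy, z: float, steps: int = 400) -> float:
    """Approximate the max over x + y = z of min(mu_a(x), mu_b(y)) by sampling x."""
    best = 0.0
    for i in range(steps + 1):
        x = a.lo + (a.hi - a.lo) * i / steps
        best = max(best, min(a.mu(x), b.mu(z - x)))
    return best


A = TriFuzzy(2, 4, 6)   # computation time of node A (Figure 2(a))
B = TriFuzzy(3, 5, 7)   # computation time of node B (Figure 2(b))
C = A + B               # fuzzy sum of the two computation times
```

The sampled check is only there to show that the endpoint rule agrees with the max-min definition on this example; a scheduler would normally use the closed-form addition alone.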
Rotation scheduling
Since rotation scheduling implicitly retimes a node in a task graph, we first briefly review the retiming
concepts. Intuitively, by using retiming, the placement of the delays in the graph (or dependency
distance between iterations) can be rearranged in such a way that the graph produces a shorter
total execution time in one iteration while the characteristics of the graph are preserved. That is, a shorter
static schedule can be obtained by retiming “some” nodes. The retiming of a node moves one delay
from each of its incoming edges to each of its outgoing edges. In other words, the number of delays on each
of its incoming edges is decremented by one and one delay is added to every outgoing edge. A node
can legally be retimed (without changing the graph behavior) if all its incoming edges contain at least
one delay¹.
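The retiming rule above can be sketched in a few lines of Python. The dictionary representation of edge delays, the function names `can_retime` and `retime`, and the example edge set are assumptions chosen for illustration; the edge set is a plausible four-node cyclic graph, not necessarily the exact graph of Figure 1(a).

```python
def can_retime(delays, node):
    """A node is legal to retime if every incoming edge carries at least one delay."""
    return all(d >= 1 for (u, v), d in delays.items() if v == node)


def retime(delays, node):
    """Return a new delay map with one delay moved from each incoming
    edge of `node` to each of its outgoing edges."""
    if not can_retime(delays, node):
        raise ValueError(f"node {node!r} has an incoming edge without a delay")
    new = dict(delays)
    for (u, v) in delays:
        if v == node:
            new[(u, v)] -= 1   # remove one delay from an incoming edge
        if u == node:
            new[(u, v)] += 1   # add one delay to an outgoing edge
    return new


# Hypothetical cyclic task graph: edge (u, v) -> number of delays.
delays = {
    ("A", "B"): 0,
    ("A", "C"): 0,
    ("B", "D"): 0,
    ("C", "D"): 0,
    ("D", "A"): 2,
}
retimed = retime(delays, "A")


def cycle_delays(d):
    """Total delay on the cycle A -> B -> D -> A."""
    return d[("A", "B")] + d[("B", "D")] + d[("D", "A")]
```

In this example the delay count on the cycle A, B, D, A is the same before and after retiming, which is the invariant that keeps the graph behavior unchanged.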
[Figure 3 panels: (a) Retime; (b) Original; (c) After shifted; (d) Reposition]
(d) Reposition:
Time PE1 PE2
- A
1 B C
2 B A
3 D
Figure 3: Retiming and scheduling
In Figure 1(a), node A can be retimed. If A is retimed once, . Figure 3(a) shows the
retimed graph after A has been retimed. A delay is moved from edge to edges ,
and . Nevertheless, applying retiming to nodes in the graph and producing a static
schedule from the retimed graph sometimes does not yield a good static schedule under resource
constraints. Rotation scheduling systematically includes the notion of retiming while considering resource
constraints [1]. We adopt the algorithm and extend its principles in order to schedule and optimize an
imprecise task graph.
¹ One of the properties that makes the retiming approach preserve the graph behavior is that the number of delays in any
cycle in the graph is always constant. Details of retiming and its properties can be found in [16].
Consider the example in Figure 1(a). Assume that the target system consists of 2 processors. Figure 3(b)
shows an initial repeated execution pattern obtained from Figure 1(c). The rotation scheduling
algorithm first selects nodes which can legally be retimed, which in this case is node A. Retiming such
a node is equivalent to shifting the iteration boundary down as shown in Figure 3(c). Node A in the first
iteration becomes a prologue instruction while task A in the next iteration is moved to the current iteration.
A prologue is the set of instructions that must be executed to provide the necessary data before
entering the iterative process. In the new retimed graph 3(a), node A has no direct dependency to any
node in the current iteration since it produces data for B and C in the next iteration. Any available
position is legal for placing node A. Figure 3(d) shows the schedule where A is assigned after C. The
resulting schedule, then, has a smaller total execution time than the previous one.
3 Algorithms
Once the computation time of a task is a fuzzy number, two issues should be considered: how to
determine the “best” place to schedule a node and how to determine whether a new schedule is better
than the old one. Therefore, the traditional scheduling algorithm has to be extended in order to handle
fuzzy situations. In our implementation, we use the conventional scheduling heuristic which attempts
to place a task at the time slot where a processor is idle, i.e., waiting for data needed to execute the
successive task. In order to do so, a task in a schedule is assigned a time step or control step (cs),
e.g., a legal earliest starting time that the task can begin execution. After that, the available time slot is
calculated by the difference between two successive tasks in the same processor.
As an example, Figure 4(a) diagrams a task graph where the computation time of each node is not
specified. Assume that two processing elements are available. The execution order of this graph can
be presented in Figure 4(b). Since requires data produced by both and , it needs to wait until
finishes in order to start the execution. Hence, PE has an empty time slot. The size of the available
time slot between tasks and needs to be estimated so that the scheduler can decide where to place
a new task that has no data dependency to any other tasks. When a processor has more than one empty
time slot, the largest one is chosen. Based on the rotation framework mentioned
in the previous section, a task can also be rescheduled when the iteration boundary is shifted. For
simplicity, the computation of each node is defuzzified using the weighted average method. After that,
the defuzzified time step can be assigned to each node in the schedule. Then the size of the free time
slot for each processor is computed as usual. Although such a method does not take advantage of
the fuzzy nature of the data, it is easy to implement and yields a comparable approximation. Our future work will
include how to compute fuzzy time steps and use the fuzzy nature to decide the most likely position to
schedule a task.
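The defuzzify-then-measure step described above might look like the following Python sketch. The text does not give the exact weighted average formula, so the sketch assumes a weighted mean of the three triangle points with the peak counted twice; the function names, the `(start, duration)` task representation, and the sample values are likewise illustrative assumptions.

```python
def defuzzify(tri, peak_weight=2.0):
    """Collapse a triangular fuzzy number (lo, peak, hi) to a crisp value.

    Assumption: a weighted mean in which the peak gets `peak_weight`
    and each endpoint gets weight 1.
    """
    lo, peak, hi = tri
    return (lo + peak_weight * peak + hi) / (2.0 + peak_weight)


def largest_idle_slot(tasks):
    """Find the widest gap between successive tasks on one processor.

    `tasks` is a list of crisp (start, duration) pairs sorted by start.
    Returns (gap_start, gap_size), or None if the processor has no gap.
    """
    best = None
    for (s0, d0), (s1, _) in zip(tasks, tasks[1:]):
        gap_start = s0 + d0
        gap = s1 - gap_start
        if gap > 0 and (best is None or gap > best[1]):
            best = (gap_start, gap)
    return best


# Two tasks on one processor: the first runs for the defuzzified time of
# (2, 4, 6); the second cannot start before time 9 because it waits for data.
first = (0.0, defuzzify((2, 4, 6)))
second = (9.0, defuzzify((1, 2, 3)))
slot = largest_idle_slot([first, second])
```

A rescheduled node with no dependency inside the current iteration could then be placed at `slot[0]` if its defuzzified computation time does not exceed `slot[1]`.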
[Figure 4 panels: (a) Task graph with nodes A, B, C, D; (b) Schedule on PE1 and PE2; (c) Max. operation max_i(u_i.t) over the parents u_1, u_2, ..., u_i of node v]
Figure 4: Another task graph, its execution order, and maximum operation
Secondly, we use fuzzy arithmetic in order to compute the size of a schedule, i.e., the total execution
time of the graph. If node v depends on more than one parent u_i, e.g., , the fuzzy
maximum of the computation times of its parents is computed (max_i(u_i.t)) (see Figure 4(c)). Then,
the result is added to the computation time of the node v, i.e., , via the fuzzy addition operation.
After the overall computation time of the graph is computed, the fuzzy schedule length is used
to determine whether such processor assignments are good. In order to compare the effectiveness, the
defuzzified crisp schedule length is recorded and the best schedule, which results in the shortest execution
time, is returned. The following presents the overall scheduling optimization algorithm. Procedure
initiates the first processor assignment which is a tuple . is a
function mapping from a node to its processor and is a function mapping from a node to its successor
in processor . The optimized schedule can be viewed by combining with the resulting task
graph . This procedure is based on the list-scheduling approach [5]. Lines 2–6 begin the schedule
optimization phase, rotation iterations. It returns a new legal task graph with the same behavior and
the processor assignments. Function selects a legal task to retime and updates the task
graph. Function finds a new legal place for a node according to the task graph and the
Algorithm 3.1 (Optimization)
Input: A fuzzy task graph , # processors,
heuristic to reschedule a node
Output: A processor assignment ,initially, NULL and
1 // create initial schedule assignment
2 for to do
3
4 // down-rotate a legal node from
5
6 if then end do
7 return
given scheduling heuristic . Note that during the optimization phase, the algorithm only needs to
find a place for a rescheduled node . After is retimed, may only require data produced by some
nodes but does not give data to any other node. Hence, only can be rescheduled while other nodes
remain at the same position in the schedule. Recall that the scheduling problem is NP-hard [6]. A lot of
computation would be required if all nodes were rescheduled. At each iteration of rotation, Procedure
Better compares the new schedule to the reference one, and the better schedule is saved for future
reference.
Algorithm 3.2 (Init Schedule)
Input: , # PEs, cur. schedule
Output: A schedule
1 where edges with delays
2 vertices in with no incoming edges // roots
3 while empty do
4 mark scheduled
5 if NULL
6 then assign at PE
7 else NULL
8 foreach PE PE to PE do
9
10 assign after the previous node at PE
11 if then end do
12
13 foreach do
14
15 if then end do end do
16 return
Now, let us consider the initial scheduling process and schedule comparison in more detail. Since
the non-zero delay edges come from previous iteration(s), Algorithm 3.2 first ignores the edges with
delays and constructs a DAG out of the input graph. Then, nodes without incoming edges (or root
nodes) are scheduled one by one (Line 2). In Lines 8–12, the algorithm assigns a node, with respect
to its ancestor(s), to every processor and checks which assignment gives the smallest schedule size.
Again, in our algorithm procedure is used to compute and the sizes of the old and the new
schedules are compared. Function counts the number of parents of . A node can be
scheduled next if all parents of a node are scheduled or . After all nodes are scheduled,
the set of processor assignments is returned.
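The flow of the initial scheduling step can be approximated by the simplified, crisp list scheduler below. It is only a sketch under stated assumptions: computation times are already defuzzified, the processor is chosen by earliest possible start rather than by comparing complete schedule lengths, and the function name `list_schedule`, the zero-based start times, and the example edges are hypothetical.

```python
def list_schedule(duration, edges, num_pes):
    """Greedy list scheduling on the zero-delay DAG.

    duration: node -> crisp computation time
    edges:    list of zero-delay (parent, child) pairs
    Returns (placement, length) where placement maps node -> (pe, start).
    """
    parents = {v: [] for v in duration}
    children = {v: [] for v in duration}
    for u, v in edges:
        parents[v].append(u)
        children[u].append(v)

    remaining = {v: len(parents[v]) for v in duration}
    ready = [v for v in duration if remaining[v] == 0]   # root nodes
    pe_free = [0.0] * num_pes                            # next free time per PE
    finish, placement = {}, {}

    while ready:
        v = ready.pop(0)
        data_ready = max((finish[p] for p in parents[v]), default=0.0)
        # Try every processor and keep the one allowing the earliest start.
        pe = min(range(num_pes), key=lambda k: max(pe_free[k], data_ready))
        start = max(pe_free[pe], data_ready)
        finish[v] = start + duration[v]
        pe_free[pe] = finish[v]
        placement[v] = (pe, start)
        for c in children[v]:
            remaining[c] -= 1
            if remaining[c] == 0:       # all parents scheduled
                ready.append(c)

    return placement, max(finish.values())


# Crisp times matching the text (B takes 2 units, the others 1);
# the zero-delay edges are an assumed reading of the example graph.
duration = {"A": 1, "B": 2, "C": 1, "D": 1}
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
placement, length = list_schedule(duration, edges, num_pes=2)
```

With these inputs the sketch yields a schedule of length 4 on two processors, in line with the static schedule discussed for Figure 1(c).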
Algorithm 3.3 considers a sub-graph (DAG) consisting of the scheduled nodes. It updates
by adding a sink node and its zero-delay edges connecting to all nodes with no outgoing edges (or
leaf-nodes). It also adds extra edges corresponding to the successor values in Line 3. Node
is used as a reference to compute the total execution time of the graph. A temporary variable
is initialized to zero. For each set of assignments (Line 4), the total schedule length is calculated by
traversing the graph in topological order [4]. The fuzzy computation time of each node is added to its
temporary variable in Line 9. This variable, actually, keeps the current fuzzy maximum computation
time from roots to all of its parents (Line 11). A node is enqueued if all of its parents have been
Algorithm 3.3 (Better)
Input: , schedules
Output: 1 if is better than , 0 otherwise.
1 where unscheduled nodes
2 // joint node
3 PE
4 foreach assignment to do
5 // init. sum. var.
6 vertices in with no incoming edges // roots
7 while empty do
8
9 // fuzzy add. to comp. time
10 foreach do
11
12
13 if then end do end do
14 length end do
15 return length length
visited (Lines 12 –13). After the last node ( ) is considered, contains the total length of such
an assignment. Both s obtained from both and are compared and the boolean value is
returned in Line 15.
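The schedule-length computation and comparison described above can be sketched as follows. Several simplifications are assumed: the fuzzy maximum of triangular numbers is approximated point by point (the exact fuzzy maximum need not be triangular), the sink node is replaced by a running maximum over all finish times, and comparison uses the same assumed weighted-average defuzzification with the peak counted twice. The names `fuzzy_length`, `tri_add`, `tri_max`, and `better` are illustrative, and the `edges` argument is expected to contain both data-dependency edges and same-processor successor edges.

```python
from collections import deque


def tri_add(a, b):
    """Fuzzy addition of triangular numbers (lo, peak, hi)."""
    return tuple(x + y for x, y in zip(a, b))


def tri_max(a, b):
    """Point-by-point approximation of the fuzzy maximum."""
    return tuple(max(x, y) for x, y in zip(a, b))


def fuzzy_length(times, edges):
    """Traverse the DAG in topological order and return the fuzzy length.

    times: node -> triangular computation time
    edges: zero-delay (u, v) pairs, including processor-order edges
    """
    succ = {v: [] for v in times}
    npar = {v: 0 for v in times}
    for u, v in edges:
        succ[u].append(v)
        npar[v] += 1

    zero = (0.0, 0.0, 0.0)
    ready_at = {v: zero for v in times}   # fuzzy max over parents' finish times
    queue = deque(v for v in times if npar[v] == 0)
    total = zero
    while queue:
        u = queue.popleft()
        finish = tri_add(ready_at[u], times[u])
        total = tri_max(total, finish)
        for v in succ[u]:
            ready_at[v] = tri_max(ready_at[v], finish)
            npar[v] -= 1
            if npar[v] == 0:              # enqueue once all parents are visited
                queue.append(v)
    return total


def better(new_len, old_len):
    """True if the defuzzified new length is shorter than the old one."""
    crisp = lambda t: (t[0] + 2 * t[1] + t[2]) / 4
    return crisp(new_len) < crisp(old_len)


times = {"A": (2, 4, 6), "B": (3, 5, 7), "C": (1, 2, 3), "D": (1, 1, 1)}
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
length = fuzzy_length(times, edges)
```

Here the longest path A, B, D dominates, so the resulting length is the sum of those three fuzzy times.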
4 Results
We have experimented with the algorithms on different benchmarks from the digital signal processing area. The
target systems consist of 3 functional units (2 adders, 1 multiplier) and 4 functional units (2 adders, 2
multipliers). In this experiment, the computation times of adders and multipliers were obtained from [9].
The confidence interval of the execution time of the adder is ns. with the typical value ns.
while the confidence interval of the execution time of the multiplier is ns. with the typical
value ns. The following table presents some numerical results. Columns start. and rot. show the
initial defuzzified schedule length and the schedule length after the optimization is applied, respectively.
Column red. shows the percentage of reduction after the optimization algorithm is applied, which indicates
the effectiveness of the proposed method.
Benchmarks                    # tasks   2 adders, 1 mult.      2 adders, 2 mults.
                                        start.  rot.   red.    start.  rot.   red.
Diff. Equation                   11      158    121    23%      111     78    29%
3-stage direct IIR Filter        12      174    141    19%      111     78    30%
All-pole Lattice Filter          15      220    142    36%      220    142    36%
Volterra Filter                  27      519    352    32%      349    225    36%
Elliptic Filter                  34      310    281    10%      278    266     4%
All-pole Lattice (uf=2) [1]      45      488    410    16%      478    405    15%
Table 1: Experimental results on 3 functional units and 4 functional units
5 Conclusion
Frequently, uncertainty occurs when estimating the computation time of a task, which creates difficulties
in deriving a good static task schedule for high-level synthesis. In this paper, a triangular fuzzy number
is used to express these uncertainties. This representation is simple and easy to approximate. Fuzzy
arithmetic can be applied to compute the overall schedule length and determine the best place to schedule.
Such calculation is very effective and time saving when compared with probability computation.
The rotation scheduling algorithm is applied to the initial processor assignment in order to reduce the possible
schedule size while considering resource constraints. The experiments show that the algorithm
can efficiently optimize a schedule.
As an initial step of this research, in the scheduling heuristic, fuzzy values are defuzzified into
crisp values, which, however, does not exploit their fuzzy nature. We are currently generalizing the
use of fuzzy arithmetic to the scheduling heuristic. Formal definitions of fuzzy time step and schedule
length are necessary. Methods to decide where to place a task have to be refined. Further, we
are also experimenting with other approaches to compare two fuzzy schedule lengths which incorporate
fuzzy distance and confidence intervals. The comparison to the use of probability theory is also under
investigation.
References
[1] L. Chao. Scheduling and Behavioral Transformations for Parallel Systems. PhD thesis, Princeton
University, October 1993.
[2] L. Chao, A. LaPaugh, and E. Sha. Rotation scheduling: A loop pipelining algorithm. In Proceedings
of the 30th Design Automation Conference, pages 566–572, Dallas, TX, June 1993.
[3] L. Chao and E. Sha. Static scheduling for synthesis of DSP algorithms on various models. Journal
of VLSI Signal Processing, pages 207–223, October 1995.
[4] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. The MIT Electrical
Engineering and Computer Science Series. McGraw-Hill Book Company, New York, 1990.
[5] El-Rewini, Lewis, and Ali. Task scheduling in parallel and distributed systems. Prentice-Hall,
1994.
[6] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of
NP-Completeness. W. H. Freeman and Company, New York, 1979.
[7] K. Gupta. Introduction to Fuzzy Arithmetic. 1st edition, 1984.
[8] O. Hammami. Fuzzy scheduling in compiler optimizations. In Proceedings of the ISUMA-NAFIPS,
1995.
[9] Texas Instruments. The TTL data book, volume 2. Texas Instruments Incorporation, 1985.
[10] R. A. Kamin, G. B. Adams, and P. K. Dubey. Dynamic list-scheduling with finite resources. In
Proceedings of the 1994 International Conference on Computer Design, pages 140–144, Cambridge,
MA, October 1994.
[11] I. Karkowski. Architectural synthesis with possibilistic programming. In HICSS-28, January 95.
[12] I. Karkowski and R. H. J. M. Otten. Retiming synchronous circuitry with imprecise delays. In
Proceedings of the 32nd Design Automation Conference, pages 322–326, San Francisco, CA,
1995.
[13] A. S. Kaviani and Z. G. Vranesic. On scheduling in multiprocess systems using fuzzy logic. In
Proceedings of the International Symposium on Multiple-valued Logic, pages 141–147, 1994.
[14] A. A. Khan, C. L. McCreary, and M. S. Jones. A comparison of multiprocessor scheduling heuristics.
In Proceedings of the 1994 International Conference on Parallel Processing, volume II,
pages 243–250, 1994.
[15] J. Lee, A. Tiao, and J. Yen. A fuzzy rule-based approach to real-time scheduling. In Proceedings
of the International Conference on Fuzzy Systems, volume 2, 1994.
[16] C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry. Algorithmica, 6:5–35, 1991.
[17] K. Mertins et al. Set-up scheduling by fuzzy logic. In Proceedings of the International conference
on computer integrated manufacturing and automation technology, pages 345–350, 1994.
[18] H. Soma, M. Hori, and T. Sogou. Schedule optimization using fuzzy inference. In Proceedings
of the International Conference on Fuzzy Systems, pages 1171–1176, 1995.
[19] F. Terrier and Z. Chen. Fuzzy calculus applied to real-time scheduling. In Proceedings of the
International Conference on Fuzzy Systems, pages 1905–1910, 1994.
[20] F. Terrier, L. Rioux, and Z. Chen. Real time scheduling under uncertainty. In Proceedings of the
International Conference on Fuzzy Systems, pages 1177–1184, 1995.
[21] I.B. Turksen et al. Fuzzy expert system shell for scheduling. SPIE, pages 308–319, 1993.
[22] L. A. Zadeh. Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1:3–28,
1978.