![Page 1: Data Placement and Task Scheduling in cloud, Online and Offline 2014.11.27 赵青 天津科技大学 zhaoqingtj@tust.edu.cn](https://reader036.vdocuments.site/reader036/viewer/2022081419/56649ee55503460f94bf40de/html5/thumbnails/1.jpg)
Data Placement and Task Scheduling in cloud, Online and Offline
2014.11.27
赵青 (Zhao Qing), 天津科技大学 (Tianjin University of Science and Technology), zhaoqingtj@tust.edu.cn
Motivation
● Improve response speed and throughput
● Guarantee QoS
● Energy Efficient and Green Computing
Overview
● Data placement for data-intensive applications
● Task scheduling for QoS and energy efficiency
● Online task scheduling
1. Data Placement for data-intensive applications
● Data clustering based on data correlation
● Correlation measure: if every two data items were placed on different nodes, how much would the data transfer amount increase?
● Methods: BEA (Bond Energy Algorithm), hierarchical clustering tree
● Objective: place closely related data items together so as to decrease data transfers
● Contributions:
1. Introduced data size factors
2. Introduced “First Order Conduction Correlation”, derived from intermediate data
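The correlation measure above can be sketched as a pairwise matrix that a clustering step (BEA or a hierarchical tree) would take as input. This is only an illustration: the weighting rule (each shared task adds the smaller of the two sizes, i.e. the amount that would have to move if the pair were split), the `tasks` and `sizes` inputs, and the function name are assumptions, not the definition used in the talk.

```python
from itertools import combinations

def correlation_matrix(tasks, sizes):
    """Return {(a, b): weight} for every pair of data items read by a common task.

    tasks: dict mapping task name -> set of data item names (hypothetical input)
    sizes: dict mapping data item name -> size, e.g. in GB (hypothetical input)
    """
    corr = {}
    for items in tasks.values():
        for a, b in combinations(sorted(items), 2):
            # Assumed data size factor: splitting a and b forces the smaller
            # one to be transferred once for each task that uses both.
            corr[(a, b)] = corr.get((a, b), 0) + min(sizes[a], sizes[b])
    return corr

tasks = {"t1": {"d1", "d2"}, "t2": {"d1", "d2", "d3"}, "t3": {"d3", "d4"}}
sizes = {"d1": 10, "d2": 4, "d3": 6, "d4": 2}
corr = correlation_matrix(tasks, sizes)
```

Pairs with a large weight are the ones a clustering step would try to keep on the same node.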
1. Data Placement for data-intensive applications
● Cloud platform modeling: physical network structure, BEA
● Data distribution
• Considerations: storage capacity, computation load balance
• Methods: “Tree-to-Tree” greedy allocation strategy, modified PSO algorithm
● Objective: make frequent data movements happen on high-speed channels, so as to improve network utilization and the efficiency of the whole cloud system.
1. Data Placement for data-intensive applications
● Runtime data placement
— Newly generated datasets are saved to the data center with which they have the maximum dependency
— The cost of re-distribution itself is also taken into account
● Results of the greedy allocation strategies:
[Figure: Total Data Movement Amount (y-axis, 1800 to 2500) vs. prediction error rate (x-axis: 10%, 20%, 30%, 50%). Series: No.3 strategy (without runtime algorithm), No.4 strategy (with runtime algorithm), DongYuan's strategy, No.5 strategy]
[Figure: Total Time Consumed by Movement (y-axis, 1800 to 2500) vs. prediction error rate (x-axis: 10%, 20%, 30%, 50%). Series: No.3 strategy (without runtime algorithm), No.4 strategy (with runtime algorithm), DongYuan's strategy, No.5 strategy]
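A minimal sketch of the runtime placement rule (save a newly generated dataset to the data center it depends on most). The inputs `dependencies`, `placement` and `free_space`, the free-space check, and the function name are illustrative assumptions. The re-distribution cost mentioned on the slide is not modeled here.

```python
def place_new_dataset(dependencies, placement, free_space, size):
    """Return the data center with the highest total dependency, or None.

    dependencies: existing data item -> dependency weight with the new dataset
    placement:    existing data item -> data center currently holding it
    free_space:   data center -> remaining storage (assumed constraint)
    size:         size of the new dataset
    """
    score = {dc: 0.0 for dc in free_space}
    for item, weight in dependencies.items():
        dc = placement.get(item)
        if dc in score:
            score[dc] += weight
    # Only data centers that can still store the new dataset are candidates.
    candidates = [dc for dc in free_space if free_space[dc] >= size]
    if not candidates:
        return None
    return max(candidates, key=lambda dc: score[dc])

placement = {"d1": "dc1", "d2": "dc1", "d3": "dc2"}
free_space = {"dc1": 5, "dc2": 50}
dependencies = {"d1": 3, "d2": 2, "d3": 4}
chosen = place_new_dataset(dependencies, placement, free_space, size=4)
```

Here `dc1` wins because its total dependency (3 + 2) exceeds that of `dc2` (4). A larger dataset that no longer fits on `dc1` would fall back to `dc2`.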
2. Task Scheduling and Virtual Machine Allocation
● Objective:
— Distribute tasks with strong data dependences to servers on a high-bandwidth connection, and turn off servers with low utilization
— Therefore:
• the response time can be reduced
• system-wide utilization can be improved
• some idle network devices can also be turned off
● Task clustering by
— Hypergraph partitioning
— BEA transformation
Efficient & Energy Saving!
2. Task Scheduling and Virtual Machine Allocation
● Requirement of tasks
— Storage requirement
— Computing resource requirement: represented by VMs
● Task scheduling and deadline constraint:
$VM_x\ (1 \le x \le m):\ (VC^{x}_{cpu},\ VC^{x}_{mem})$
$t_i\ (1 \le i \le n):\ (VM_i,\ WCET_i,\ T^{deadline}_i)$
Decrease the number of VMs as much as possible, while ensuring users’ Service Level Agreements.
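One way to picture "as few VMs as possible under deadlines" is a first-fit heuristic. It is a stand-in for illustration, not the scheduling algorithm of the talk. The sketch assumes identical VMs, tasks that run one after another on a VM, all tasks released at time 0, and a `(name, wcet, deadline)` tuple per task.

```python
def allocate_vms(tasks):
    """Greedy first-fit in earliest-deadline order.

    tasks: list of (name, wcet, deadline) tuples (hypothetical input format).
    Returns a list of VMs, each given as the ordered list of task names it runs.
    """
    vms = []  # each entry: [time at which the VM becomes free, [task names]]
    for name, wcet, deadline in sorted(tasks, key=lambda t: t[2]):
        if wcet > deadline:
            raise ValueError(f"task {name} cannot meet its deadline on any VM")
        for vm in vms:
            if vm[0] + wcet <= deadline:  # task still finishes in time here
                vm[0] += wcet
                vm[1].append(name)
                break
        else:
            vms.append([wcet, [name]])  # open a new VM only when forced to
    return [names for _, names in vms]

tasks = [("t1", 2, 4), ("t2", 3, 5), ("t3", 1, 6), ("t4", 4, 6)]
vms = allocate_vms(tasks)
```

In this example three tasks share one VM and only `t4` needs a second one, so two VMs suffice without missing a deadline.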
2. Task Scheduling and Virtual Machine Allocation
● Physical machine allocation
— Optimization objective: energy efficiency, high-bandwidth networks, load balance
— Greedy strategy:
• Each server’s energy efficiency
• TRD (Task Requirement Degree)
• Top-down & bottom-up: reduce data transfers and improve network utilization
• Load balance
— Constraint conditions: storage capacity, CPU and memory constraints
— Other methods: genetic algorithms, PSO algorithms
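Optimal utilization level in terms of performance-per-watt: $Opt_y$. Commonly, $Opt_y \approx 70\%$.
$$TRD^{x} = \begin{cases} Util_y, & 0 < Util_y < Opt_y \\ 0, & Util_y = 0 \ \text{or}\ Util_y \ge Opt_y \end{cases}$$
$$Util_y = \frac{\sum_{x=1}^{m} RQ^{x}_{CPU}}{C^{y}_{CPU}} \times 100\%$$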
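A small sketch of how a greedy step might use utilization and TRD to pick a target server. It assumes that TRD equals the current CPU utilization while a server is running below its optimal level and is zero for idle or already saturated servers, with the 70% optimum from the slide. The `servers` data and function names are hypothetical.

```python
def utilization(cpu_requests, cpu_capacity):
    """CPU utilization of one server in percent."""
    return 100.0 * sum(cpu_requests) / cpu_capacity

def task_requirement_degree(util, opt=70.0):
    """Assumed TRD rule: favor partly loaded servers below the optimum."""
    return util if 0.0 < util < opt else 0.0

# server name -> (CPU requests of the VMs it hosts, CPU capacity)
servers = {"s1": ([2, 2], 8), "s2": ([1], 8), "s3": ([], 8), "s4": ([4, 3], 8)}
trd = {
    name: task_requirement_degree(utilization(requests, capacity))
    for name, (requests, capacity) in servers.items()
}
target = max(trd, key=trd.get)  # greedy choice: highest TRD first
```

Under these assumptions the idle server `s3` and the overloaded server `s4` both get TRD 0, so new work is packed onto `s1`, which leaves `s3` free to stay switched off.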
3. Online Scheduling
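● Problems:
— How to schedule the tasks in a fine-grained workflow?
— How to deal with variable conditions at runtime?
● Reinforcement learning based methods
$$R(h) = \sum_{t=1}^{T} \gamma^{t-1}\, r(s_t, a_t, s_{t+1})$$
$$J(\theta) = \int p(h \mid \theta)\, R(h)\, dh$$
$$p(h \mid \theta) = p(s_1) \prod_{t=1}^{T} p(s_{t+1} \mid s_t, a_t)\, \pi(a_t \mid s_t, \theta)$$
The goal of RL is to find the optimal policy parameter
$$\theta^{*} = \arg\max_{\theta} J(\theta)$$
[Diagram: agent-environment loop. The Agent sends Action a to the Environment, which returns State s and Reward r.]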
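The expected return J is an integral over trajectories h, so in practice it is usually approximated by averaging the returns of sampled trajectories. The sketch below shows that Monte Carlo estimate. The discount factor `gamma` and the reward lists are illustrative assumptions (`gamma=1.0` gives a plain sum of rewards), and no policy or environment is modeled.

```python
def trajectory_return(rewards, gamma=1.0):
    """R(h): discounted sum of the rewards collected along one trajectory."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def estimate_expected_return(sampled_rewards, gamma=1.0):
    """Monte Carlo estimate of J: mean return over sampled trajectories."""
    returns = [trajectory_return(rewards, gamma) for rewards in sampled_rewards]
    return sum(returns) / len(returns)

# Rewards of two hypothetical trajectories generated by the same policy.
samples = [[1.0, 1.0, 1.0], [0.0, 2.0]]
j_hat = estimate_expected_return(samples, gamma=0.5)
```

A policy search method would repeat this estimate for different policy parameters and move toward the ones with the larger value.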
3. Online Scheduling
Example: Cart-Pole Swing-up
● Task: swing up the pole by moving the cart
● State (2-D continuous): angle $\varphi \in [0, 2\pi]$ and angular velocity $\dot{\varphi} \in [-3\pi, 3\pi]$ of the pole
● Action (1-D continuous): force applied to the cart
● Reward: $r(s_t, a_t, s_{t+1}) = \cos(\varphi_{t+1})$
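The swing-up reward can be written as a one-line function. The sketch assumes the pole angle is measured in radians from the upright position, so the reward is largest when the pole points straight up. The function name is hypothetical.

```python
import math

def swing_up_reward(next_angle):
    """Reward for reaching a state whose pole angle is next_angle (radians)."""
    return math.cos(next_angle)

upright = swing_up_reward(0.0)       # pole pointing up
hanging = swing_up_reward(math.pi)   # pole hanging down
```

With this shape the agent is rewarded at every step for keeping the pole near the top, not only for reaching it once.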
Thank you for your time!