
Page 1: Practical Resource Management in Power-Constrained, High Performance Computing

Tapasya Patki*, David Lowenthal, Anjana Sasidharan, Matthias Maiterth, Barry Rountree, Martin Schulz, Bronis R. de Supinski

June 18, 2015

Lawrence Livermore National Laboratory

Page 2: The Power Problem

40% of procured power goes unused!

[Figure: breakdown of procured power, highlighting the unused fraction]

Page 3: Problem is power utilization, not just power procurement

[Figure: power utilization in megawatts over time, with the desirable range of utilization marked]

Do not: "save" power.

Do: use power to improve performance and system throughput, and accomplish more science.

The Exascale Power Problem

Page 4: Hardware Overprovisioning

Worst-case provisioning (traditional): all nodes can run at peak power simultaneously.

Hardware overprovisioning:
• Buy more capacity and limit power per node
• Reconfigure dynamically based on the application's memory and scalability characteristics
• Moldable applications: flexible in the node and core counts on which they can execute

Page 5: Configurations

Good performance relies on choosing an application-specific configuration:
• Number of nodes, cores per node, and power per node: (n x c, p)

In our case, choosing the right configuration improves performance under a power bound by 32% (1.47x) on average compared to worst-case provisioning.
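The sketch below illustrates the idea of picking an overprovisioned (n x c, p) configuration under a power bound. It is not RMAP's actual selection code; the performance model (predict_runtime), the candidate ranges, and the toy numbers are assumptions for illustration.

```python
# Sketch: choose the (nodes, cores, watts_per_node) configuration with the
# lowest predicted runtime whose total CPU power fits under the job's bound.
from itertools import product

def best_configuration(node_counts, core_counts, power_caps,
                       power_bound, predict_runtime):
    best_cfg, best_time = None, float("inf")
    for n, c, p in product(node_counts, core_counts, power_caps):
        if n * p > power_bound:          # configuration exceeds the power bound
            continue
        t = predict_runtime(n, c, p)     # application-specific model (assumed)
        if t < best_time:
            best_cfg, best_time = (n, c, p), t
    return best_cfg, best_time

# Toy model: strong scaling with a serial fraction, plus a mild penalty for
# capping per-node power below a 115 W peak. Purely illustrative.
toy_model = lambda n, c, p: (0.1 + 0.9 / (n * c)) * (115.0 / p) ** 0.5 * 1000
print(best_configuration(range(4, 33, 4), (8, 12, 16), (65, 80, 95, 115),
                         power_bound=2000, predict_runtime=toy_model))
```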

Page 6: Hardware Overprovisioning Example

SP-MZ CFD kernel, power bound of 3500 W; p is the per-node CPU power cap.

Configuration     (n x c, p)        Time (s)    Total CPU & DRAM Power (W)
Worst-case        (20 x 16, 115)    9.10        3250
Overprovisioned   (26 x 12, 80)     3.65        3497

The overprovisioned configuration gives a 2.5x speedup (9.10 s / 3.65 s) and utilizes nearly all of the allocated power.

Page 7: Resource Management

What is the impact of overprovisioning when we have a real cluster with multiple users and several jobs?

Can we utilize the procured power better and minimize wasted power?

Power-Aware Scheduling Challenges

User: Users care about fairness and turnaround time

• Fair and transparent job-level power allocation • Minimize execution time, reduce queue wait time

System: Admins care about utilization and throughput • Maximize utilization of available nodes and power • Minimize average turnaround time for job queue

Lawrence Livermore National Laboratory

Page 9: Resource MAnager for Power (RMAP)

Aimed at future power-constrained systems.

Implemented within SLURM.

Novel Adaptive policy:
• Uses overprovisioning and power-aware backfilling
• Improves system power utilization and optimizes execution time under a job-level power bound
• Leads to 19% and 36% faster turnaround times than the Traditional and Naïve baselines, respectively

Page 10: First-Come First-Serve Scheduling

[Figure: nodes (1..N) versus time under the FCFS policy; jobs A and B from the current job queue are scheduled strictly in arrival order, leaving an unutilized region between reservations R' and R'']

Page 11: Backfilling

[Figure: the same nodes-versus-time schedule under plain FCFS (with an unutilized region between reservations R' and R'') and with backfilling, where a later job is moved forward into the idle nodes]
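A minimal sketch of node-level (EASY-style) backfilling, for illustration only; the job representation and reservation logic here are assumptions, not RMAP's or SLURM's implementation.

```python
# Jobs are (id, nodes_needed, est_runtime). A later job may jump ahead only
# if it fits in the currently idle nodes and finishes before the time at
# which the job at the head of the queue is reserved to start.
def backfill(queue, free_nodes, now, head_start):
    started = []
    for job in list(queue):
        job_id, nodes, runtime = job
        if nodes <= free_nodes and now + runtime <= head_start:
            queue.remove(job)
            free_nodes -= nodes
            started.append(job_id)
    return started

# Toy usage: 10 idle nodes, head job "A" reserved to start at t=100.
q = [("A", 32, 500), ("B", 8, 60), ("C", 2, 30)]
print(backfill(q, free_nodes=10, now=0, head_start=100))  # ['B', 'C']
```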

Page 12: Insight: Power-Aware Backfilling

[Figure: the same scheduling view with power (1..W) instead of nodes on the vertical axis; under a FCFS policy for power, jobs A and B from the current job queue leave an unutilized region of the power budget between reservations R' and R'']

Page 13: Insight: Power-Aware Backfilling

[Figure: power versus time under the FCFS policy for power (with an unutilized region) and under power-aware backfilling, where a moldable job is reshaped to fill the idle power between reservations R' and R'']
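A sketch of how power-aware backfilling could exploit moldability; the (nodes, cores, watts, runtime) configuration tuples and the acceptance test are illustrative assumptions, not RMAP's implementation.

```python
# Each queued job carries candidate (nodes, cores, watts_per_node, runtime)
# configurations. A job can be backfilled if some configuration fits both the
# idle nodes and the idle power, and finishes before the head job's reserved
# start time.
def power_aware_backfill(queue, free_nodes, free_watts, now, head_start):
    started = []
    for job in list(queue):
        job_id, configs = job
        for nodes, cores, watts, runtime in configs:
            fits = nodes <= free_nodes and nodes * watts <= free_watts
            if fits and now + runtime <= head_start:
                queue.remove(job)
                free_nodes -= nodes
                free_watts -= nodes * watts
                started.append((job_id, (nodes, cores, watts)))
                break
    return started

# Toy usage: 12 idle nodes and 1000 W of idle power; a moldable job "B" falls
# back to a lower-power configuration that fits the hole.
q = [("B", [(12, 16, 115, 40), (12, 12, 80, 55)])]
print(power_aware_backfill(q, 12, 1000, now=0, head_start=100))
```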

Page 14: RMAP Policies: Inputs

Users request nodes and time.

Job-level power bound:
• Fairly allocate power to each job based on the fraction of total nodes requested

We assume equal priority.
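One natural reading of this fair-share rule, written out for concreteness (an interpretation of the bullet above, not necessarily RMAP's exact formula):

```python
# A job's power bound is the global power bound scaled by its node share.
def job_power_bound(global_power_bound, requested_nodes, total_nodes):
    return global_power_bound * requested_nodes / total_nodes

# e.g., a job asking for 40 of 64 nodes under a 6500 W global bound:
print(job_power_bound(6500, 40, 64))  # 4062.5 W
```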

Page 15: RMAP Adaptive Policy

• If enough power is available, allocate the best overprovisioned configuration under the derived job-level power bound
• Otherwise, allocate a suboptimal overprovisioned configuration with the available power
• Users can specify an optional performance slowdown threshold for potentially faster turnaround times (the default is no slowdown, 0%)
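The decision logic above, paraphrased as a sketch. The configuration tuples, the power-fit test, and the way the slowdown threshold widens the acceptable runtime window are assumptions based on these bullet points, not RMAP's SLURM plugin code.

```python
# configs: candidate (nodes, cores, watts_per_node, est_runtime) tuples.
# slowdown: user-specified threshold (0.0 by default = no slowdown allowed).
def adaptive_allocate(configs, job_power_bound, available_watts, slowdown=0.0):
    under_bound = [c for c in configs if c[0] * c[2] <= job_power_bound]
    if not under_bound:
        return None                                   # no config fits the job's bound
    best = min(under_bound, key=lambda c: c[3])       # best config under the bound
    limit = best[3] * (1.0 + slowdown)                # acceptable runtime window
    runnable_now = [c for c in under_bound
                    if c[0] * c[2] <= available_watts and c[3] <= limit]
    if runnable_now:
        return min(runnable_now, key=lambda c: c[3])  # start now with available power
    return best                                       # otherwise wait for the best config
```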

Page 16: RMAP Baseline Policies

Policy        Description
Traditional   Not fair-share; allocates the requested nodes with all cores at full power
Naïve         Greedily allocates the best-performing configuration under the derived job-level power bound

*All policies use basic node-level backfilling.

Page 17: Experimental Details

Intel Sandy Bridge 64-node cluster:
• 2 sockets per node, 8 cores per socket
• Power cap range: min 51 W, max 115 W

Intel RAPL for power measurement and control*

Moldable applications:
• SPhot, NAS-MZ (BT-MZ, SP-MZ, and LU-MZ)
• Four synthetic applications

*DRAM power unavailable
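RAPL on this class of hardware is exposed through model-specific registers. Below is a hedged sketch of reading the package energy counter through Linux's /dev/cpu/*/msr interface; the register addresses are the documented Sandy Bridge-era values, but this is an illustration, not the measurement tooling used in the study.

```python
# Read the RAPL package-0 energy counter twice and report average power.
# Requires the msr kernel module and root privileges.
import struct
import time

MSR_RAPL_POWER_UNIT = 0x606    # bits 12:8 = energy status units (1/2^n joules)
MSR_PKG_ENERGY_STATUS = 0x611  # 32-bit wrapping package energy counter

def read_msr(cpu, reg):
    with open(f"/dev/cpu/{cpu}/msr", "rb") as f:
        f.seek(reg)
        return struct.unpack("<Q", f.read(8))[0]

def pkg_power_watts(cpu=0, interval=1.0):
    joules_per_tick = 0.5 ** ((read_msr(cpu, MSR_RAPL_POWER_UNIT) >> 8) & 0x1F)
    e0 = read_msr(cpu, MSR_PKG_ENERGY_STATUS) & 0xFFFFFFFF
    time.sleep(interval)
    e1 = read_msr(cpu, MSR_PKG_ENERGY_STATUS) & 0xFFFFFFFF
    ticks = (e1 - e0) % (1 << 32)   # handle 32-bit counter wraparound
    return ticks * joules_per_tick / interval

if __name__ == "__main__":
    print(f"Package 0 power: {pkg_power_watts():.1f} W")
```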

Page 18: Evaluation

• SLURM simulator, 64 nodes, 30 jobs per trace
• 5 global power bounds, from 6500 W (extremely constrained) to 14000 W (unconstrained)
• Poisson process for dynamic job arrival
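For concreteness, a Poisson arrival process can be generated from exponentially distributed inter-arrival gaps; the mean inter-arrival time below is an arbitrary placeholder, not the value used in the evaluation.

```python
# Generate job arrival times for a trace using a Poisson process.
import random

def poisson_arrivals(num_jobs, mean_interarrival_s, seed=0):
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(num_jobs):
        t += rng.expovariate(1.0 / mean_interarrival_s)  # exponential gap
        times.append(t)
    return times

print(poisson_arrivals(num_jobs=30, mean_interarrival_s=120.0)[:3])
```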

Page 19: Random Trace Results

On average, Adaptive with no slowdown does 19% better than Traditional and 36% better than Naïve.

[Figure: turnaround-time comparison across random traces; the chart also marks worst-case provisioning and annotations of 13% and 31%]

Page 20: Detailed Results: 6500 W Bound

Policy          Average Turnaround Time (s)
Traditional     684
Naïve           990
Adaptive, 0%    636 (7% better than Traditional)
Adaptive, 20%   536 (21% better than Traditional)

Extremely power-limited: 128 processors, 50 W per socket with fair-share. Each job requests at least 40 nodes.

Page 21: Detailed Results: 6500 W Bound

• 22 of 30 jobs have faster turnaround times than under Traditional
• 21% faster turnaround time on average

[Figure: average turnaround time (s) per job ID for Traditional, Adaptive 0%, and Adaptive 20%]

Page 22: RMAP Summary

Adaptive policy:
• Uses hardware overprovisioning and power-aware backfilling
• Leads to 19% and 36% faster queue turnaround times than Traditional and Naïve, respectively
• Improves individual application performance as well as system throughput

Page 23: Acknowledgments

• Livermore Computing at Lawrence Livermore National Laboratory for MSR access
• Dr. Ghaleb Abdulla for help with Vulcan power data