HPC Resource Accounting: Progress Against Allocation — Lessons Learned

Ken Schumacher, LISA 2014, Seattle, WA, 12 November 2014


DESCRIPTION

My invited talk presentation given at the LISA 2014 conference in Seattle, WA, Nov 12, 2014. I describe some of the lessons learned while tracking usage of the HPC computing facilities of the Lattice Quantum Chromodynamics (LQCD) collaboration. The reports cover compute clusters at Fermi National Accelerator Laboratory.

TRANSCRIPT

Page 1: HPC Resource Accounting

Ken Schumacher, LISA 2014, Seattle, WA, 12 November 2014

HPC Resource Accounting: Progress Against Allocation — Lessons Learned

Page 2: HPC Resource Accounting

My Background in Batch System Accounting

In other words, "Why should I listen to this guy?"
• I've been at Fermilab since Jan 1997, nearly 18 years
  - I started out supporting a few specific experiments. My group managed Fermilab's central Unix cluster. I later moved to batch farms using LSF as the batch scheduler.
  - I was also a developer on the team that developed the first prototype of the Gratia Grid Accounting System (used by OSG, the Open Science Grid).
• For the last 5 years I have been part of the HPPC Group administering several InfiniBand-based HPC clusters.
  - I generate weekly resource accounting reports
  - I work with Principal Investigators (PIs) to manage allocations
  - I monitor our compliance with SLOs and Fermilab policies

Page 3: HPC Resource Accounting

Why call it Resource Accounting?

• First: Resources
  - Wikipedia: "a source or supply from which benefit is produced"
  - Compute clusters offering unique resources, designed around the needs of a particular group of researchers
    • LQCD Cluster - actually 4 CPU and 2 GPU sub-clusters
    • Cosmology Cluster
    • Accelerator Modeling Cluster
  - Also offering shared on-line storage in our Lustre clusters
  - Access to an offline storage service from the DMS (Data Movement and Storage) department
  - And the ever-present staff as a resource. But accounting for staff is outside the scope of my presentation.

Page 4: HPC Resource Accounting

Why do I call it Resource Accounting?

• Next: Accounting
  - Noun: a report or description of an event or experience; a detailed account of what has been achieved
• The stakeholders that oversee and fund these collaborations and their research need to know several things
  - More than just how their money was spent, but what it accomplished, in the form of:
    • Availability/uptime of the computers, storage and services
    • Usage of the offered resources by projects within the collaboration
    • Papers and reports of progress on the research being conducted
• The usage reports allow for budgeting and planning for future projects, including new hardware acquisition

Page 5: HPC Resource Accounting

Current State of the Reporting System

• The reporting tools that we use today are a work in progress
  - Over the last four years there has been a great improvement in the workflow of generating (automating) the weekly report
  - The scope of the reporting has been revised as the requirements have expanded
  - There is a significant list of changes and improvements still needed
• I am here to share those things that became important (and useful) as the scope of our reports expanded
  - We now include additional types of resources (GPU and storage) as part of the allocation
  - We added more detailed reporting of usage by project so we can adjust both quotas and batch submission priorities

Page 6: HPC Resource Accounting

Who are my customers?

• The HPPC department supports several massively parallel compute clusters used by different groups
  - Theoretical particle physicists associated with Lattice Quantum Chromodynamics, or LQCD
    • The users within this collaboration are from all over the world
    • The collaboration has compute resources at several institutions
  - Astrophysicists at Fermilab using our Cosmology Cluster
  - Fermilab scientists, engineers and software developers doing Accelerator Modeling on the AMR cluster

(Slide photos: the Ds cluster at FNAL and the 10q cluster at JLab)

Page 7: HPC Resource Accounting


Disclaimer: I am not a Theoretical Particle Physicist


Page 8: HPC Resource Accounting


Why we need High Performance Computing (HPC) clusters

Discovered in the early 1970s, the theory of quantum chromodynamics (QCD) consists of equations that describe the strong force that causes quarks to clump together to form protons and other constituents of matter. For a long time, solving these equations was a struggle. But in the last decade, using powerful supercomputers, theorists have finally been able to solve the equations of QCD with high precision.

Page 9: HPC Resource Accounting

Lattice Quantum Chromodynamics Computing at FNAL

• Fermilab's LQCD computing cluster is made up of a few sub-clusters based on similar configurations
  - Sub-clusters of conventional CPU-based nodes
    • Jpsi cluster - decommissioned May 2014; has been our standard for normalized core-hours since 2010. 856 nodes, dual-socket quad-core Opteron 2352 (2.1 GHz) on DDR InfiniBand fabric.
    • Ds cluster - 420 nodes, quad-socket eight-core Opteron 6128 (2.0 GHz) on QDR InfiniBand fabric. This is 13,440 cores.
    • Bc cluster - 224 nodes, quad-socket eight-core Opteron 6320 (2.8 GHz) on QDR InfiniBand fabric. This is 7,168 cores.
    • Pi0 cluster - 214 nodes, dual-socket eight-core Intel E5-2650v2 "Ivy Bridge" (2.6 GHz) on QDR InfiniBand fabric. This is 3,424 cores.

(continued . . .)

Page 10: HPC Resource Accounting

Lattice Quantum Chromodynamics Computing

• Fermilab's LQCD computing cluster (. . . continued)
  - Sub-clusters of nodes enhanced with GPU processors
    • Dsg cluster - 76 nodes, quad-socket eight-core Opteron 6128 (2.0 GHz) with two NVIDIA Tesla M2050 GPUs each, on QDR InfiniBand fabric.
    • Pi0g cluster - 32 nodes, dual-socket eight-core Intel E5-2650v2 "Ivy Bridge" (2.6 GHz) with four NVIDIA Tesla K40m GPUs each, on QDR fabric.
  - On-line disk-based storage in a Lustre cluster
    • The LQCD Lustre cluster has 964 TB of on-line storage after our most recent expansion.
    • The Cosmology Lustre cluster has 129 TB of on-line storage.
  - Tape-based storage in our SL8500 robotic tape libraries
    • 1,617 LTO4 tapes (1,293.6 TB)
    • 331 10KC tapes (1,655.0 TB)
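The tape totals above follow from the native cartridge capacities, roughly 0.8 TB per LTO-4 and 5 TB per T10000C ("10KC") cartridge; a quick arithmetic check in Python:

# Quick check of the tape-library figures above, using native
# (uncompressed) cartridge capacities.
lto4_tb = 1617 * 0.8    # LTO-4 at 0.8 TB each  -> 1,293.6 TB
t10kc_tb = 331 * 5.0    # T10000C at 5 TB each  -> 1,655.0 TB
print(f"LTO4 total: {lto4_tb:,.1f} TB")
print(f"10KC total: {t10kc_tb:,.1f} TB")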

Page 11: HPC Resource Accounting

Other Compute Resources within USQCD

• From the PY 2014-15 Call for Proposals
  - Compute resources dedicated to Lattice QCD
    • 71 M BG/Q core-hours at BNL
    • 397 M Jpsi core-hours on clusters at FNAL and JLab
    • 8.9 M GPU-hours on GPU clusters at FNAL and JLab
  - Compute resource awards to USQCD from the DOE's INCITE program
    • 100 M Cray XK7 core-hours at Oak Ridge (OLCF)
    • 240 M BG/Q core-hours at Argonne (ALCF)

Page 12: HPC Resource Accounting

Allocations at FNAL/QCD

• The USQCD Allocation Committee allocates time on the clusters in units of 'normalized core-hours'.
• The program year is July 1 through June 30.
• Three classes of allocations:
  - A - large allocations, which "support calculations of benefit for the whole USQCD Collaboration and/or addressing critical scientific needs."
  - B - medium allocations (< 2.5 M core-hours), "intended to support calculations in an early stage of development which address, or have the potential to address, scientific needs of the collaboration."
  - C - small and/or short-term allocations, to explore, test, or benchmark calculations with the potential to address scientific needs of the collaboration.

Page 13: HPC Resource Accounting

Lesson 1 - We needed a normalized "currency"

• The allocation is like a budget.
• We base allocations on normalized core-hours.
• Normalized core-hours are basically our currency.
• CPU performance and GPU performance are like apples and oranges.
• GPUs are designed for vector math and floating-point calculations.
• Some simulations rely more heavily on floating point.
• Code compiled for GPUs cannot run on CPUs.
• We had to develop new benchmarks for use with GPUs.
• So projects get separate allocations for CPU and GPU processing.

Page 14: HPC Resource Accounting

Normalizing the USQCD Clusters at Fermilab

• Our existing HPC clusters for LQCD:
  » Ds cluster: factor of 1.33 nC-H, 13,440 cores, 17,875 nCores
  » Bc cluster: factor of 1.48 nC-H, 7,168 cores, 10,609 nCores
  » Pi0 cluster: factor of 3.14 nC-H, 3,424 cores, 10,751 nCores
  » Ds GPU cluster: factor of 1.0 nC-H, 152 GPUs, 152 nGPUs
  » Pi0 GPU cluster: factor of 2.2 nC-H, 128 GPUs, 280 nGPUs
• Storage: tape at 3K nC-H per TB, disk at 30K nC-H per TB
• PY July 1, 2014 through June 30, 2015 allocation:
  » 270,375,000 CPU nC-H
  » 2,815,000 GPU nC-H
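To make the "currency" conversion concrete, here is a minimal Python sketch of how raw usage could be mapped to normalized core-hours using the factors above. The function names and structure are my own illustration, not our production reporting scripts.

# Minimal sketch: raw core-hours (or GPU-hours) on each sub-cluster are
# scaled by that cluster's normalization factor relative to the retired
# Jpsi cluster, and storage is charged at a flat rate per TB.
CPU_FACTORS = {"Ds": 1.33, "Bc": 1.48, "Pi0": 3.14}   # nC-H per raw core-hour
GPU_FACTORS = {"Dsg": 1.0, "Pi0g": 2.2}               # nGPU-H per raw GPU-hour
TAPE_NCH_PER_TB = 3_000
DISK_NCH_PER_TB = 30_000

def normalized_cpu_hours(cluster, raw_core_hours):
    """Convert raw core-hours on a CPU sub-cluster to normalized (Jpsi) core-hours."""
    return raw_core_hours * CPU_FACTORS[cluster]

def storage_charge_nch(tape_tb=0.0, disk_tb=0.0):
    """Charge tape and disk storage against the allocation, in nC-H."""
    return tape_tb * TAPE_NCH_PER_TB + disk_tb * DISK_NCH_PER_TB

# Example: 10,000 raw core-hours on Pi0 plus 2 TB of tape
charged = normalized_cpu_hours("Pi0", 10_000) + storage_charge_nch(tape_tb=2)
print(f"{charged:,.0f} nC-H charged")   # 37,400 nC-H charged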

Page 15: HPC Resource Accounting

Sample Allocation Notification Letter

Hello Professor,

I am setting the allocations and configuring accounts for the USQCD Scientific Program Committee allocations at Fermilab for the program year 2014-2015. I have you listed as the PI contact for the following allocation:

  Flavor Physics from B, D, and K Mesons on the 2+1+1-Flavor HISQ Ensembles
  49.28 M CPU hours, 825 K GPU hours, 50 TB disk and 251 TB tape

If this does not match your information, please let us know. We need two things from you, please:
  1) Your choice of a project name
  2) A list of the users allowed to submit jobs under this project name

Page 16: HPC Resource Accounting

FNAL Progress Against Allocation Reports

• We use a series of homegrown Perl and Python scripts to generate our weekly progress-against-allocation report:
  - A summary of who used each sub-cluster
  - A listing of specific credits and debits for the week
  - A YTD summary of CPU cluster usage
  - A YTD summary of GPU cluster usage
• We can include debits/credits against the allocation for several reasons (explained later):
  - Credits for reduced performance during load-shed events
  - Debits for storage: long-term (tape) and on-line (disk)
  - Credits for failed jobs (due to a system failure)
  - Debits for dedicated nodes set aside for a project
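As a rough illustration of what the weekly ledger boils down to, here is a Python sketch of summing per-project debits and credits in normalized core-hours. The entry structure, project names and amounts are hypothetical, not taken from the actual reports.

from dataclasses import dataclass

@dataclass
class LedgerEntry:
    project: str
    description: str
    nch: float   # normalized core-hours; positive = debit, negative = credit

def weekly_balance(entries):
    """Sum this week's debits and credits per project."""
    totals = {}
    for e in entries:
        totals[e.project] = totals.get(e.project, 0.0) + e.nch
    return totals

entries = [
    LedgerEntry("projectA", "Pi0 batch usage", 120_000.0),
    LedgerEntry("projectA", "credit: failed jobs (system outage)", -8_500.0),
    LedgerEntry("projectB", "quarterly tape charge (12 TB at 3K nC-H/TB)", 36_000.0),
]
for project, total in weekly_balance(entries).items():
    print(f"{project:10s} {total:>12,.1f} nC-H")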

Page 17: HPC Resource Accounting


Sample of a CPU Sub-cluster Weekly Detail Report


Page 18: HPC Resource Accounting


Sample of a GPU Sub-cluster Weekly Detail Report


Page 19: HPC Resource Accounting


Sample of the Debits / Credits - Quarterly Tape Usage


Page 20: HPC Resource Accounting


Page 21: HPC Resource Accounting

Part I - The Report Header

• The header describes where we are in the program year.
• It also provides some explanation of the numbers in the report itself.

Page 22: HPC Resource Accounting


Page 23: HPC Resource Accounting

Part II - Allocation and Usage by Project (Weekly)

• The left side of the weekly summary report has the summary usage for just this week, but across all the sub-clusters.

Page 24: HPC Resource Accounting

Part II - Allocation and Usage by Project (Allocation)

• The middle part of the weekly summary report has the allocation granted and the amount used PYTD, by project.

Page 25: HPC Resource Accounting

Part II - Allocation and Usage by Project (PYTD)

• The right side of the weekly summary report has the summary usage for the program year to date (PYTD), across all the sub-clusters.

Page 26: HPC Resource Accounting

Lesson 2 - Adjustments to Batch Priorities (lower)

• Project charmonium has just crossed over its allocation.
  - I will go into the configuration files of the batch scheduler and change the priority for this one project.
  - The new priority is set to a negative number.
  - This causes any jobs that this project puts into the queue to wait until the jobs of projects that still have allocation remaining have been allowed to run.
• A sub-cluster that is not currently billable is configured so that all projects, regardless of their allocation, run at equal priority.
• Opportunistic running
  - If there are nodes available, we allow these over-allocation projects, or a project that simply does not have an allocation, to run on what would otherwise be idle nodes. (A sketch of the bookkeeping side of this check follows below.)
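The priority change itself is a manual edit in the scheduler's configuration files; as an illustrative Python sketch (my own, with hypothetical project names and figures), this is the kind of check that tells us which projects should be dropped to opportunistic priority:

def projects_over_allocation(used_nch, allocated_nch):
    """Return projects whose program-year-to-date usage exceeds their allocation."""
    return [p for p, used in used_nch.items() if used > allocated_nch.get(p, 0.0)]

# Hypothetical PYTD figures in normalized core-hours
used = {"charmonium": 51.2e6, "projectB": 30.0e6}
allocated = {"charmonium": 49.28e6, "projectB": 49.28e6}

for project in projects_over_allocation(used, allocated):
    # In practice this is done by hand: the scheduler config is edited so the
    # project's jobs run only opportunistically, on otherwise idle nodes.
    print(f"{project}: over allocation, set a negative priority")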

Page 27: HPC Resource Accounting

Lesson 2 - Adjustments to Batch Priorities (increase)

• There are occasions where we may increase the priority for one or more projects:
  - To meet a deadline for a paper
  - To generate simulation data that is needed as input to upcoming simulations
• In some cases we may dedicate some number of nodes to a specific project.

Page 28: HPC Resource Accounting


Page 29: HPC Resource Accounting

Part III - Totals and Progress Against Allocation (Weekly)

• The left side of the weekly summary report has the summary usage for just this week, but across all the sub-clusters.

Page 30: HPC Resource Accounting

Part III - Totals and Progress Against Allocation (Allocation)

• The middle part of the weekly summary report has the allocation granted and the amount used PYTD, by project.

Page 31: HPC Resource Accounting

Part III - Totals and Progress Against Allocation (PYTD)

• The right side of the weekly summary report has the summary usage for the program year to date (PYTD), across all the sub-clusters.

Page 32: HPC Resource Accounting


Page 33: HPC Resource Accounting

Lesson 3 - Credits for Failed Jobs

• Occasionally a job will fail "softly": it does not crash, and it reports a successful completion, so it is billed against an allocation.
• When a soft failure is discovered, we manually calculate a credit to the project to reimburse the previous charges. (A sketch of the arithmetic follows below.)
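A minimal Python sketch of that reimbursement arithmetic, assuming we know the cores used, the wall time the failed job consumed, and the sub-cluster's normalization factor; the example numbers are illustrative:

def soft_failure_credit(cores_used, wall_hours, cluster_factor):
    """Normalized core-hours to credit back for one softly-failed job."""
    return cores_used * wall_hours * cluster_factor

# Example: a 512-core job that ran 6 hours on Bc (factor 1.48) before the
# failure was noticed gets its charge refunded.
credit = soft_failure_credit(cores_used=512, wall_hours=6.0, cluster_factor=1.48)
print(f"Credit: {credit:,.1f} nC-H")   # Credit: 4,546.6 nC-H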

Page 34: HPC Resource Accounting

Lesson 4 - Credits for Reduced Performance

• We have had occasions during the summer where our cooling equipment could not keep up with the heat being generated.
• Our Facilities group will notify us of a potential load-shed event.
• During a load-shed event, some number of the nodes in our clusters are simply turned off.
• The nodes that remain on-line have their clock speeds reduced, they run at a decreased load, and wall-time limits are increased.
• All jobs during a load shed get a credit for the extra time used. (A sketch of one way to compute such a credit follows below.)

• Sorry, no sample. We have been able to avoid load-shed events since July 1.
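One simple way to model the credit (this proportional model is my own illustration, not the production calculation): if a node's clock is reduced, a job needs correspondingly more wall time, and that extra fraction of the billed hours is credited back.

def load_shed_credit(billed_nch, normal_ghz, reduced_ghz):
    """Credit the 'extra' normalized core-hours caused by a reduced clock speed."""
    slowdown = normal_ghz / reduced_ghz         # e.g. 2.6 / 1.8 ~ 1.44x longer
    return billed_nch * (1.0 - 1.0 / slowdown)  # fraction of billed time to refund

# Example: 10,000 nC-H billed while an "Ivy Bridge" node ran at 1.8 GHz
# instead of its normal 2.6 GHz (hypothetical reduced clock).
credit = load_shed_credit(billed_nch=10_000, normal_ghz=2.6, reduced_ghz=1.8)
print(f"Credit: {credit:,.1f} nC-H")   # Credit: 3,076.9 nC-H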

Page 35: HPC Resource Accounting

Lesson 5 - Usefulness of Burn Rates

• The burn rates allow us to notify a PI that a project may be using its allocation at a rate that is too high or too low. (A sketch of a simple burn-rate check follows below.)
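A minimal Python sketch of such a check, comparing the fraction of the allocation consumed against the fraction of the program year elapsed; the 20% tolerance and the example figures are illustrative, not our actual policy:

from datetime import date

def burn_rate_flag(used_nch, allocated_nch, today, py_start, py_end, tolerance=0.20):
    """Flag projects whose usage pace is well ahead of or behind the program year."""
    year_fraction = (today - py_start).days / (py_end - py_start).days
    used_fraction = used_nch / allocated_nch
    if used_fraction > year_fraction + tolerance:
        return "burning too fast"
    if used_fraction < year_fraction - tolerance:
        return "burning too slowly"
    return "on pace"

status = burn_rate_flag(used_nch=30.0e6, allocated_nch=49.28e6,
                        today=date(2014, 11, 12),
                        py_start=date(2014, 7, 1), py_end=date(2015, 6, 30))
print(status)   # "burning too fast": ~61% used with only ~37% of the year elapsed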

Page 36: HPC Resource Accounting

Summary of Lessons Learned

• Normalized core-hours
  - Using a standard unit across different facilities
  - CPU vs GPU: these truly are "apples vs oranges"
  - Charges for storage
• Adjustments to batch priorities for fair share
  - Reduced priority for over-allocation or un-allocated projects
  - Increased priority or dedicated nodes where needed
• Charges for dedicated nodes
• Credits for failed jobs and for load-shed events
  - Failed jobs that are not the user's fault
  - Load-shed events that are driven by "mother nature"
• Usefulness of burn rates

Page 37: HPC Resource Accounting

And I appreciate your patience with my hearing loss. Please step to a microphone. Feel free to find me in the Hallway Track. I am here all week.

Time for your questions.

Page 38: HPC Resource Accounting
