Page 1: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Upcoming Biowulf Seminars

• November 30, 1 - 3 pm: Python in HPC
  Overview of Python tools used in high performance computing, and how to improve the performance of your Python jobs on Biowulf
• Jan 16, 1 - 3 pm: Relion tips and tricks, and Parallel jobs and benchmarking
  Mechanics and best practices for submitting RELION jobs to the batch system from both the command line and via the RELION GUI, as well as methods for monitoring and evaluating the results. Scaling of parallel jobs, and how to benchmark to make effective use of your allocated resources

Bldg 50, Rm 1227

Page 2

Making Effective Use of the Biowulf Batch System

NIH HPC Systems

[email protected], CIT

Oct 30, 2017

Page 3

Effective Use == Effective Resource Allocation

• Specifying resources
• Estimating required resources
• Allocating resources with sbatch and swarm
• Monitoring resource allocation
• Scheduling and resource allocation
• Post-mortem analysis

Page 4

Hardware Terminology Review

[Diagram labels: Node, Processor, CPU, CPUs, Hyper-threading]

Page 5

Estimating Resources

• CPU
  • Check documentation (https://hpc.nih.gov/apps/)
  • Objective: match CPU:Threads 1:1 (there are exceptions, e.g., MD jobs)
• Memory
  • Run a job or swarm with a large memory allocation
  • Check actual memory usage
  • Add 10% to actual memory usage
• Time
  • Run a job or swarm with a large time allocation
  • Check actual walltime
  • Add 10% to actual walltime
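The "add 10%" rule for memory and walltime can be sketched in a few lines of Python. This is only an illustration: the helper names `pad_estimate` and `format_walltime` are made up for this example, the input figures are sample values, and rounding the walltime up to a whole minute is an assumption rather than anything the slides prescribe.

```python
import math

def pad_estimate(observed: float, margin: float = 0.10) -> float:
    """Return the observed value increased by the given margin (10% by default)."""
    return observed * (1.0 + margin)

def format_walltime(seconds: float) -> str:
    """Format seconds as D-HH:MM:SS, rounding up to the next whole minute."""
    minutes = math.ceil(seconds / 60)
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}-{hours:02d}:{mins:02d}:00"

# Sample test run (illustrative values): peak memory 0.8 GB, runtime 5 min 29 s.
mem_request_gb = pad_estimate(0.8)
time_request = format_walltime(pad_estimate(5 * 60 + 29))
print(f"memory request: {mem_request_gb:.2f} GB")
print(f"time request:   {time_request}")
```

The padded values would then be passed as the memory and time allocations on the next submission.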

Page 6

Allocating Resources with sbatch and swarm

• All jobs
  • --mem (sbatch) or -g (swarm)
  • --time (sbatch and swarm)
  • -b to bundle command lines (swarm)
• Single-threaded jobs
  • “-p 2” to load cores with 2 threads (swarm)
• Multi-node jobs
  • “Parallel Jobs and Benchmarking” Jan 16
• Multi-threaded jobs
  • --cpus-per-task (sbatch) or -t (swarm)
  • Use $SLURM_CPUS_PER_TASK in batch script
  • OMP_NUM_THREADS
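For a multi-threaded job, the point of $SLURM_CPUS_PER_TASK is that the program's thread count follows the allocation instead of being hard-coded. A minimal Python sketch of that idea is shown here. The function names are hypothetical, and the fallback to 1 thread assumes the variable is absent when no CPU count was requested.

```python
import os

def thread_count(env=os.environ) -> int:
    """Return the CPUs allocated per task, or 1 if the variable is not set."""
    return int(env.get("SLURM_CPUS_PER_TASK", "1"))

def child_environment(env=os.environ) -> dict:
    """Copy the environment and set OMP_NUM_THREADS to the allocated CPU count."""
    child = dict(env)
    child["OMP_NUM_THREADS"] = str(thread_count(env))
    return child

n = thread_count()
print(f"running with {n} thread(s)")
# A threaded program could then be launched with matching settings, e.g.
#   subprocess.run(["my_threaded_tool", "--threads", str(n)], env=child_environment())
# ("my_threaded_tool" and its "--threads" flag are placeholders, not real commands.)
```

Keeping the thread count tied to the allocation is what makes the CPU:Threads 1:1 objective hold when the allocation changes.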

Page 7

Monitoring Resource Allocation

• CPU
  • jobload while the job is running
  • Dashboard during or after the job
  • (No easy way to monitor GPU utilization at the moment)
• Walltime
  • jobhist, Dashboard or sacct during or after the job has completed
• Memory
  • jobload while the job is running
  • jobhist, Dashboard or sacct during or after the job has completed

Page 8

jobload

% jobload -u someuser
     JOBID              TIME             NODES   CPUS   THREADS   LOAD       MEMORY
                   Elapsed / Wall                Alloc   Active            Used / Alloc
  51863534   6-22:08:01 / 10-00:00:00   cn3095      4        4    100%    1.0 / 8.0 GB
  51863535   6-22:08:01 / 10-00:00:00   cn3256      4        5    125%    0.9 / 8.0 GB
  51863536   6-22:08:01 / 10-00:00:00   cn3348      4        1     25%    1.0 / 8.0 GB
  51863537   6-22:08:01 / 10-00:00:00   cn3401      4        3     75%    0.9 / 8.0 GB
  51881591   6-19:42:16 / 10-00:00:00   cn3097      4        1     25%    1.0 / 8.0 GB

% jobload -j 51874438_233
         JOBID              TIME             NODES   CPUS   THREADS   LOAD       MEMORY
                       Elapsed / Wall                Alloc   Active            Used / Alloc
  51874438_233   6-20:10:13 / 10-00:00:00   cn3105      2        1     50%    0.5 / 1.5 GB
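In the sample output, every LOAD value equals active threads divided by allocated CPUs (for example 5 / 4 = 125%). The sketch below reproduces that arithmetic for the jobs shown and attaches a plain-language reading. The function `classify_load` and its wording are illustrative and not part of jobload.

```python
def classify_load(cpus_alloc: int, threads_active: int) -> str:
    """Express active threads as a percentage of allocated CPUs and label it."""
    load = 100 * threads_active / cpus_alloc
    if load > 100:
        return f"{load:.0f}%: overloaded, more threads than CPUs"
    if load < 100:
        return f"{load:.0f}%: underused, some allocated CPUs are idle"
    return f"{load:.0f}%: CPU:Threads matched 1:1"

# (job ID, CPUs allocated, threads active) copied from the jobload example
jobs = [
    ("51863534", 4, 4),
    ("51863535", 4, 5),
    ("51863536", 4, 1),
    ("51863537", 4, 3),
    ("51881591", 4, 1),
]
for jobid, cpus, threads in jobs:
    print(jobid, classify_load(cpus, threads))
```

A single snapshot can mislead, since thread activity may vary over the life of a job, so a reading like this is best repeated a few times while the job runs.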

Page 9

jobhist

# jobhist 52102264_67
Jobid         Partition  State      Nodes  CPUs  Walltime  Runtime   MemReq      MemUsed  Nodelist
52102264_67   norm       COMPLETED  1      2     02:00:00  00:05:29  4.0GB/node  0.8GB    cn3185

[Annotation labels on the slide: "allocated", "used"]
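The jobhist line lends itself to a quick allocated-versus-used comparison. The Python sketch below computes what fraction of the requested walltime and memory this job consumed, using the values from the example. The helper names are invented for illustration.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def utilization(used: float, allocated: float) -> float:
    """Return used / allocated as a percentage."""
    return 100.0 * used / allocated

# Values from the jobhist example: Walltime 02:00:00, Runtime 00:05:29,
# MemReq 4.0GB/node, MemUsed 0.8GB.
time_pct = utilization(hms_to_seconds("00:05:29"), hms_to_seconds("02:00:00"))
mem_pct = utilization(0.8, 4.0)
print(f"walltime used: {time_pct:.1f}% of allocation")
print(f"memory used:   {mem_pct:.1f}% of allocation")
```

Both figures are low for this job, which suggests the time and memory requests could be reduced considerably on the next run.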

Page 10

sacct

% sacct --format=Jobname,AllocCPUS,AllocNodes,ReqMem,MaxRSS,Elapsed -j 52102332
   JobName  AllocCPUS AllocNodes     ReqMem     MaxRSS    Elapsed
---------- ---------- ---------- ---------- ---------- ----------
tbss_2_reg          2          1        4Gn              00:05:29
     batch          2          1        4Gn    815152K   00:05:29
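sacct reports MaxRSS with a unit suffix, so comparing it with the requested memory takes a unit conversion. The sketch below converts the value from the example. It assumes the suffixes are 1024-based, and `maxrss_to_gib` is a hypothetical helper, not part of any Slurm tool.

```python
# Multipliers relative to KiB, assuming 1024-based unit suffixes.
UNIT_FACTORS_KIB = {"K": 1, "M": 1024, "G": 1024 ** 2}

def maxrss_to_gib(value: str) -> float:
    """Convert a size string such as '815152K' to GiB."""
    number, unit = float(value[:-1]), value[-1].upper()
    return number * UNIT_FACTORS_KIB[unit] / 1024 ** 2

print(f"{maxrss_to_gib('815152K'):.2f} GiB")
```

The result, about 0.78 GiB, is consistent with the 0.8GB MemUsed figure that jobhist reports for a job with the same elapsed time.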

Page 11

Using Your Dashboard to Monitor Jobs

https://hpc.nih.gov

https://hpc.nih.gov/dashboard/

Page 12

Scheduling and Resource Allocation

• Scheduling is determined by job priority
  • Priority is determined by Fairshare value of user
  • Fairshare is determined by recent CPU and memory allocations of running jobs
• Unnecessarily long time allocation will prevent jobs from being backfilled
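Why a long time allocation blocks backfill can be shown with a deliberately simplified model: a waiting job may use idle resources ahead of a higher-priority job only if its requested time fits in the gap before that job is due to start. The sketch below is a toy illustration under that assumption, not the scheduler's actual algorithm, and the numbers are hypothetical.

```python
def can_backfill(requested_minutes: int, gap_minutes: int) -> bool:
    """A waiting job fits a scheduling gap only if its requested time fits."""
    return requested_minutes <= gap_minutes

# Hypothetical situation: CPUs are idle for 30 minutes before a
# higher-priority job is due to start on them.
gap = 30
print(can_backfill(requested_minutes=120, gap_minutes=gap))  # 2 h requested
print(can_backfill(requested_minutes=10, gap_minutes=gap))   # 10 min requested
```

The decision depends on the requested time, not the actual runtime, so a job that would finish in minutes still misses the gap if it asked for hours.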

Page 13

Why are my jobs pending?

• ‘freen’ shows free CPUs but not free memory or disk
• Other jobs have higher priority (sprio)
• Nodes are reserved for higher-priority jobs

Page 14

Consequences of…

          Specifying more resources than needed          Specifying fewer resources than needed
CPU       Wasted CPU resources, possibly unnecessary     Job runs a little / a lot slower
          scheduling delays
Memory    Wasted memory resources, possibly unnecessary  Job is “Killed” by the kernel
          scheduling delays
Time      Possibly unnecessary scheduling delays         Job is killed by the batch system

Page 15

Post-mortem of jobs using user Dashboard

Or

The Good, the Bad, and the Ugly…

Page 16

Comment: job is running with default allocations for CPU and memory
Recommendation: if a subjob of a large swarm, try “-p 2”

Page 17

A+

Page 18

Another A+

Page 19

Recommendation: reduce CPU allocation

Page 20

A+

Page 21

Comment: perfect CPU utilization; underutilized memory but entire node is allocated due to CPU allocation

Page 22

Comment: good overall utilization; possibly split into two jobs with a dependency, and with differing resource allocations

Page 23

Comment: 8 CPUs too little / too much, 2 would do
Recommendation: could run in half the memory

Page 24

Page 25

Comment: good memory utilization
Recommendation: increasing CPU allocation probably won’t help

Page 26

Comment: CPUs badly overloaded
Recommendation: could run in less than half the memory

Page 27

Comment: CPUs overloaded 200%
Recommendation: 256 GB memory allocated, MBs used

Page 28

Comment: good memory utilization
Recommendation: might run faster with 32 CPUs?

Page 29

Recommendation: increasing CPU count might improve?

Page 30

Recommendation: allocating 56 CPUs would likely help