![Page 1: HTCondor User Tutorial - University of Wisconsin–Madisonpages.stat.wisc.edu/~jgillett/679/CHTC/ChristinaKochCHTC2nov17.pdf · 1 HIGH-THROUGHPUT COMPUTING AND YOUR RESEARCH Christina](https://reader031.vdocuments.site/reader031/viewer/2022030410/5a9864bb7f8b9a9c5b8d648c/html5/thumbnails/1.jpg)
HIGH-THROUGHPUT COMPUTING AND YOUR RESEARCH
Christina Koch, Research Computing Facilitator
Center for High Throughput Computing
STAT679, November 2, 2017
About Me
• I work for the Center for High Throughput Computing (CHTC)
• We provide resources for UW Madison faculty, students and staff whose research problems are too big for their laptops/desktops
Overview
• Large Scale Computing
• High Throughput Computing
• Large Scale Computing Resources and the Center for High Throughput Computing (CHTC)
• Running HTC Jobs at CHTC
• User Expectations
• Next Steps
LARGE SCALE COMPUTING
What is large-scale computing?
larger than ‘desktop’ (in memory, data, processors)
Problem: running many computations takes a long time (running on one processor)
[Figure: computations on one processor; axis: time]
So how do you speed things up?
[Figure; axis: time]
Break up the work! Use more processors! (parallelize)
[Figure; axes: time, n processors]
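As a rough illustration of why breaking up the work helps, the sketch below estimates wall-clock time for a batch of independent, equal-length computations. It assumes an idealized model (no queue wait, no file-transfer overhead, every processor available at once); the function name and the example numbers are made up for illustration and are not from the talk.

```python
import math

def estimated_wall_hours(num_jobs: int, hours_per_job: float, processors: int) -> float:
    """Idealized wall-clock time: jobs run in ceil(num_jobs / processors) waves."""
    if num_jobs < 0 or hours_per_job < 0 or processors < 1:
        raise ValueError("need num_jobs >= 0, hours_per_job >= 0, processors >= 1")
    waves = math.ceil(num_jobs / processors)
    return waves * hours_per_job

# 1,000 one-hour computations on 1 processor versus 100 processors
serial = estimated_wall_hours(1000, 1.0, 1)
parallel = estimated_wall_hours(1000, 1.0, 100)
print(serial, parallel)
```

Real turnaround will be longer than this estimate, since shared resources involve waiting in a queue and moving files.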
High throughput computing
[Figure; axes: time, n processors]
High performance computing
[Figure; axes: time, n processors]
HIGH THROUGHPUT COMPUTING
High Throughput Examples
• Test many parameter combinations
• Analyze multiple images or datasets
• Do a replicate/randomized analysis
• Align genome/RNA sequence data
Explaining Post-Katrina Home Rebuilding
Economics professor, Jesse Gregory, performs HTC optimization of a model to predict the most important factors determining household rebuilding after Katrina.
Most important rebuilding factors:
- relative funding available to household if rebuilt
- rebuild status of neighboring households
http://www.opensciencegrid.org/using-high-throughput-computing-to-evaluate-post-katrina-rebuilding-grants/
Jesse’s projects in the last year: 4.5 million hours, 1.5 million OSG hours
[Figure; axes: Fraction of Neighbors Rebuilt, (Repair Cost / Replacement Cost); annotations: more funds qualified for, less funds qualified for]
The Philosophy of HTC
• Break work into many ‘smaller’ jobs
  • single or few CPUs, short run times, smaller input and output per job
• Run on as many processors as possible
  • smaller and shorter jobs are best
  • take dependencies with you (like R)
• Automate as much as you can
  • black box programs that use various input files
  • numbered files
• Scale up gradually
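The "numbered files" idea can be sketched as follows: write one small input file per parameter combination, named with a running index, so that each job picks up its own file. The parameter names (`alpha`, `seed`), the file pattern `input_<n>.json`, and the scratch directory are hypothetical choices for illustration only.

```python
import itertools
import json
import tempfile
from pathlib import Path

def write_numbered_inputs(param_grid: dict, out_dir: Path) -> list:
    """Write one JSON file per parameter combination: input_0.json, input_1.json, ..."""
    out_dir.mkdir(parents=True, exist_ok=True)
    names = sorted(param_grid)
    paths = []
    combos = itertools.product(*(param_grid[n] for n in names))
    for index, values in enumerate(combos):
        path = out_dir / f"input_{index}.json"
        path.write_text(json.dumps(dict(zip(names, values))))
        paths.append(path)
    return paths

# hypothetical grid: 3 values of alpha x 2 seeds = 6 small jobs
demo_dir = Path(tempfile.mkdtemp())
files = write_numbered_inputs({"alpha": [0.1, 0.5, 1.0], "seed": [1, 2]}, demo_dir)
print(len(files), files[0].name)
```

Numbering from 0 is a deliberate choice here: HTCondor's `$(Process)` variable also counts from 0, so a file index and a job's process number can line up.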
CENTER FOR HIGH THROUGHPUT COMPUTING
High throughput computing
[Figure; axes: time, n processors]
WHAT WE NEED
Lots of computers, to run multiple independent computations
CHTC Services
Center for High Throughput Computing, est. 2006
• Large-scale, campus-shared computing systems
• high-throughput computing (HTC) and high-performance computing (HPC) systems
• all standard services provided free-of-charge
• hardware buy-in options
• support and training for using our systems
• proposal assistance
• chtc.cs.wisc.edu
CHTC Accessible Computing
CHTC: time limit <72 hrs/job; 10,000+ CPU hrs/day
CHTC Accessible Computing
UW Grid: up to ~8 hrs/job; ~20,000 CPU hrs/day
CHTC: <72 hrs/job; 10,000 hrs/day
CHTC Accessible Computing
Open Science Grid: up to ~4 hrs/job; ~200,000 CPU hrs/day
UW Grid: up to ~8 hrs/job; ~20,000 CPU hrs/day
CHTC: <72 hrs/job; 10,000 hrs/day
Researchers who use the CHTC are located all over campus (red buildings)
http://chtc.cs.wisc.edu
Quick Facts             Jul’12-Jun’13   Jul’13-Jun’14   Jul’14-Jun’15
Million Hours Served    97              132             265
Research Projects       120             148             188
Departments             52              56              61
Individual researchers: 30 years of computing per day
What else?
• Software
  • We can support most open source software (R, Python, other projects)
  • Can also support Matlab
• Data
  • CHTC cannot *store* data
  • However, you can work with up to several TB of data on our system
How it works
• Job submission
  • Instead of running programs on your own computer, log in and submit jobs
  • Job = single independent calculation
  • Can run hundreds of jobs at once (or more)
• HTCondor
  • Computer program that controls and runs jobs on CHTC’s computers
Getting Started
• Facilitators
  • Help researchers get started
  • Advise on best approach and resources for your research problem
• How do I get an account?
  • Temporary accounts created for this class.
  • If you want to use CHTC beyond this class for research, fill out our account request form: http://chtc.cs.wisc.edu/form
RUNNING A JOB ON CHTC’S HIGH THROUGHPUT SYSTEM WITH HTCONDOR
How It Works (in CHTC)
• Submit jobs to a queue (on a submit server)
• HTCondor schedules them to run on computers that belong to CHTC (execute servers)
[Diagram: one submit server and three execute servers]
HTCONDOR
What is HTCondor?
• Software that schedules and runs computing tasks on computers
Job Example
• Consider an imaginary program called “compare_states”, which compares two data files and produces a single output file.
[Diagram: wi.dat and us.dat go into compare_states, which produces wi.dat.out]
$ compare_states wi.dat us.dat wi.dat.out
Submit File
• List your executable and any arguments it takes.
• Arguments are any options passed to the executable from the command line.
$ compare_states wi.dat us.dat wi.dat.out
job.submit:
executable = compare_states
arguments = wi.dat us.dat wi.dat.out

should_transfer_files = YES
transfer_input_files = us.dat, wi.dat
when_to_transfer_output = ON_EXIT

log = job.log
output = job.out
error = job.err

request_cpus = 1
request_disk = 20MB
request_memory = 20MB

queue 1
Submit File
• Indicate your input files (here wi.dat and us.dat) with transfer_input_files.
Submit File
• HTCondor will transfer back all new and changed files (usually output) from the job, here wi.dat.out.
Submit File
• log: file created by HTCondor to track job progress
• output/error: captures stdout and stderr
Submit File
• Request the appropriate resources for your job to run.
• queue: keyword indicating “create a job.”
Submitting and Monitoring
• To submit a job/jobs: condor_submit submit_file_name
• To monitor submitted jobs, use: condor_q
$ condor_submit job.submit
Submitting job(s).
1 job(s) submitted to cluster 128.

$ condor_q
-- Schedd: submit-5.chtc.wisc.edu : <128.104.101.92:9618?...
ID      OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
128.0   alice   5/9 11:09    0+00:00:00  I   0    0.0   compare_states wi.dat us.dat

1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
HTCondor Manual: condor_submit
HTCondor Manual: condor_q
condor_q
• By default condor_q shows only the user’s own jobs*
• Constrain with username, ClusterId or full JobId
$ condor_q
-- Schedd: submit-5.chtc.wisc.edu : <128.104.101.92:9618?...
ID      OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
128.0   alice   5/9 11:09    0+00:00:00  I   0    0.0   compare_states wi.dat us.dat

1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
* as of version 8.5
JobId = ClusterId.ProcId
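To make the JobId structure concrete, here is a small sketch that splits an ID such as 128.0 into its cluster and process parts. The helper name `split_job_id` is hypothetical; it is plain string handling, not part of HTCondor.

```python
def split_job_id(job_id: str):
    """Split "128.0" into (128, 0); a bare cluster such as "128" gives (128, None)."""
    cluster, _, proc = job_id.partition(".")
    return int(cluster), (int(proc) if proc else None)

print(split_job_id("128.0"))
```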
Job Idle
$ condor_q
-- Schedd: submit-5.chtc.wisc.edu : <128.104.101.92:9618?...
ID      OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
128.0   alice   5/9 11:09    0+00:00:00  I   0    0.0   compare_states wi.dat us.dat

1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
Submit Node
(submit_dir)/
  job.submit
  compare_states
  wi.dat
  us.dat
  job.log
  job.out
  job.err
Job Starts
$ condor_q
-- Schedd: submit-5.chtc.wisc.edu : <128.104.101.92:9618?...
ID      OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
128.0   alice   5/9 11:09    0+00:00:00  <   0    0.0   compare_states wi.dat us.dat w

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Submit Node
(submit_dir)/
  job.submit
  compare_states
  wi.dat
  us.dat
  job.log
  job.out
  job.err
Transferring to the execute node: compare_states, wi.dat, us.dat
Execute Node
(execute_dir)/
Job Running
$ condor_q
-- Schedd: submit-5.chtc.wisc.edu : <128.104.101.92:9618?...
ID      OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
128.0   alice   5/9 11:09    0+00:01:08  R   0    0.0   compare_states wi.dat us.dat

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Submit Node
(submit_dir)/
  job.submit
  compare_states
  wi.dat
  us.dat
  job.log
  job.out
  job.err
Execute Node
(execute_dir)/
  compare_states
  wi.dat
  us.dat
  stderr
  stdout
  wi.dat.out
Job Completes
$ condor_q
-- Schedd: submit-5.chtc.wisc.edu : <128.104.101.92:9618?...
ID      OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
128.0   alice   5/9 11:09    0+00:02:02  >   0    0.0   compare_states wi.dat us.dat

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Execute Node
(execute_dir)/
  compare_states
  wi.dat
  us.dat
  stderr
  stdout
  wi.dat.out
Transferring back to the submit node: stderr, stdout, wi.dat.out
Submit Node
(submit_dir)/
  job.submit
  compare_states
  wi.dat
  us.dat
  job.log
  job.out
  job.err
Job Completes (cont.)
$ condor_q
-- Schedd: submit-5.chtc.wisc.edu : <128.104.101.92:9618?...
ID      OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
Submit Node
(submit_dir)/
  job.submit
  compare_states
  wi.dat
  us.dat
  job.log
  job.out
  job.err
  wi.dat.out
TRY IT OUT
Setting up
• Log In
  • hostname: learn.chtc.wisc.edu
  • username: your campus NetId
• Copy a sample submit file and script to your home directory:
  • cp /home/groups/stat679/example/* ./
• Choose a name to pass as the argument, by adding it to the submit file.
Submit the job
• Submit the job and monitor it in the queue
  • condor_submit job.sub
  • condor_q
• Check the .out and .log files once the job completes:
  • What does the message say?
  • How much memory/disk did your job use?
Submitting Multiple Jobs
Replacing single job inputs with a variable of choice
Single job:
executable = compare_states
arguments = wi.dat us.dat wi.dat.out
transfer_input_files = us.dat, wi.dat
queue 1
With a variable:
executable = compare_states
arguments = $(infile) us.dat $(infile).out
transfer_input_files = us.dat, $(infile)
queue ...
Possible Queue Statements
multiple “queue” statements (Not Recommended):
infile = wi.dat
queue 1
infile = ca.dat
queue 1
infile = ia.dat
queue 1
matching ... pattern (Recommended):
queue infile matching *.dat
in ... list (Recommended):
queue infile in (wi.dat ca.dat ia.dat)
from ... file (Recommended):
queue infile from state_list.txt
state_list.txt:
wi.dat
ca.dat
ia.dat
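For the "from ... file" form, the list file can be generated rather than typed by hand. The sketch below is one possible way, assuming the per-state inputs are the `*.dat` files in a single directory and that `us.dat` is the shared comparison file that should not get a job of its own. The function name, its parameters, and the scratch directory are hypothetical.

```python
import tempfile
from pathlib import Path

def write_input_list(data_dir: Path, pattern: str = "*.dat",
                     list_name: str = "state_list.txt",
                     exclude: tuple = ("us.dat",)) -> list:
    """Write matching file names, one per line, to list_name; return the names."""
    names = sorted(p.name for p in data_dir.glob(pattern) if p.name not in exclude)
    (data_dir / list_name).write_text("".join(name + "\n" for name in names))
    return names

# demo in a scratch directory with empty placeholder files
demo_dir = Path(tempfile.mkdtemp())
for name in ("wi.dat", "ca.dat", "ia.dat", "us.dat"):
    (demo_dir / name).touch()
names = write_input_list(demo_dir)
print(names)
```

A list file gives finer control than a bare `*.dat` pattern, which in this layout would also match `us.dat`.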
Multiple Jobs
• Create a list of names in a text file called “names.txt”
• Create a copy of the submit file called “multiple-jobs.sub”
• Change two lines in “multiple-jobs.sub”
  • arguments = $(name)
  • queue name from names.txt
• Submit the jobs to see what happens.
USER EXPECTATIONS
Be responsible!
• These resources are shared and you get to use them for free -- be a good citizen.
• Ask questions if you aren’t sure about something.
• Don’t run programs directly on the submit server.
• Data files should be small. Talk to us if you want to submit jobs with large (> 1GB) data files.
Resource Request
• Jobs are nearly always using a part of a computer, not the whole thing
• Very important to request appropriate resources (memory, cpus, disk) for a job
[Diagram: your request shown as one part of the whole computer]
Resource Assumptions
• Even if your system has default CPU, memory and disk requests, these may be too small!
• Important to run test jobs and use the log file to request the right amount of resources:
  • requesting too little: causes problems for your jobs and other users’ jobs; jobs might be held by HTCondor
  • requesting too much: jobs will match to fewer “slots”
Log File
000 (128.000.000) 05/09 11:09:08 Job submitted from host: <128.104.101.92&sock=6423_b881_3>
...
001 (128.000.000) 05/09 11:10:46 Job executing on host: <128.104.101.128:9618&sock=5053_3126_3>
...
006 (128.000.000) 05/09 11:10:54 Image size of job updated: 220
    1 - MemoryUsage of job (MB)
    220 - ResidentSetSize of job (KB)
...
005 (128.000.000) 05/09 11:12:48 Job terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
    0 - Run Bytes Sent By Job
    33 - Run Bytes Received By Job
    0 - Total Bytes Sent By Job
    33 - Total Bytes Received By Job
    Partitionable Resources :    Usage   Request  Allocated
       Cpus                 :                  1          1
       Disk (KB)            :       14     20480   17203728
       Memory (MB)          :        1        20         20
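Comparing usage against the request can be scripted once there are many logs to check. The sketch below pulls the three rows of the "Partitionable Resources" table out of a job log's text. It assumes the row labels and column order shown in the log above and purely numeric fields; the layout may differ between HTCondor versions, and the function name `parse_resources` is hypothetical.

```python
import re

# one row of the "Partitionable Resources" table, e.g. "Disk (KB) : 14 20480 17203728"
ROW = re.compile(r"^\s*(Cpus|Disk \(KB\)|Memory \(MB\))\s*:\s*(.*)$")

def parse_resources(log_text: str) -> dict:
    """Return {resource: {"usage", "request", "allocated"}} from a job log."""
    table = {}
    for line in log_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        numbers = [float(n) if "." in n else int(n) for n in match.group(2).split()]
        # the Usage column can be blank (as for Cpus in the log above), leaving two numbers
        if len(numbers) == 2:
            numbers = [None] + numbers
        if len(numbers) != 3:
            continue
        usage, request, allocated = numbers
        table[match.group(1)] = {"usage": usage, "request": request, "allocated": allocated}
    return table

sample = """\
    Partitionable Resources :    Usage  Request Allocated
       Cpus                 :                 1         1
       Disk (KB)            :       14    20480  17203728
       Memory (MB)          :        1       20        20
"""
resources = parse_resources(sample)
print(resources["Memory (MB)"])
```

In this sample the job used 1 MB of memory and 14 KB of disk against requests of 20 MB and 20480 KB, so the requests comfortably cover the test job.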
TESTING IS KEY!
ALWAYS run test jobs before submitting many jobs at once.
Facilitators
• Christina (me) and Lauren Michael work for CHTC as Research Computing Facilitators.
• It’s our job to answer questions and help people get started with computing at CHTC.
• Email us! [email protected]
• Or come to office hours in the WID:
  • Tues/Thurs, 3:00 - 4:30
  • Wed, 9:30 - 11:30
http://chtc.cs.wisc.edu/get-help
NEXT STEPS
Building up a workflow
• Try to get ONE job running
  • Follow the steps on the following slides
  • Troubleshoot
  • Check memory/disk requirements
• Do a small scale test of 5-10 jobs
  • Check memory + disk requirements *again*
• Run full-scale set of jobs
Special Considerations for Code
• Your code should be able to take in different inputs (usually via an argument)
• Example (in R):
# analysis.R
options <- commandArgs(trailingOnly=TRUE)
input_file <- options[[1]]
# continue w/ input file

$ R CMD BATCH "--args input.tgz" analysis.R
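The same pattern carries over to other languages. Here is a minimal Python sketch, assuming a hypothetical script `analysis.py` whose first command-line argument is the input file; the function names are illustrative only.

```python
import argparse

def parse_args(argv=None):
    """Read the input file name from the command line (or from argv, for testing)."""
    parser = argparse.ArgumentParser(description="Run the analysis on one input file.")
    parser.add_argument("input_file", help="path of the input file for this job")
    return parser.parse_args(argv)

def main(argv=None) -> str:
    args = parse_args(argv)
    # continue with args.input_file, as in the R example
    return args.input_file

# stand-in for running: python analysis.py input.tgz
print(main(["input.tgz"]))
```

Because the file name comes in as an argument, the same script can serve every job in a batch, with only the argument changing from job to job.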
Special Considerations for R
Guide for R Jobs: chtc.cs.wisc.edu/r-jobs.shtml
Need to:
1. prepare a portable R installation
2. create a primary “executable” script that will run your R code
3. submit jobs with different input for each
1. Prepare portable R
1. download your required R version (.tar.gz)
2. follow CHTC’s guide to start an interactive “build” job: chtc.cs.wisc.edu/inter-submit.shtml
3. install R and any libraries
4. pack up the resulting R installation directory
5. exit the interactive session
described in: chtc.cs.wisc.edu/r-jobs.shtml
2. Create the Job “executable”
#!/bin/bash
# command line script (runR.sh) to run R

# untar the portable R installation:
tar -xzvf R.tar.gz

# make sure the above R installation will be used:
export PATH=$(pwd)/R/bin:$PATH

# run the R script, passing this script's first argument on to it:
R CMD BATCH "--args $1" myscript.R
3. Create the submit file
executable = runR.sh
arguments = input.tgz
log = $(Process).log
error = $(Process).err
output = $(Process).out

transfer_input_files = R.tar.gz, input.tgz, myscript.R

request_cpus = 1
request_disk = 1GB
request_memory = 1GB

queue