distributed systems, parallel computing
8/13/2019 Distributed systems, parallel computing
CS/SE 4F03 Distributed Computer Systems
Ned Nedialkov
Department of Computing and Software, McMaster University, Winter 2014
Assignment 1: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/A1.pdf
This is a good online book: http://tacc-web.austin.utexas.edu/veijkhout/public_html/Articles/EijkhoutIntroToHPC.pdf
1 Introduction
This course is devoted to studying distributed parallel architectures, parallel algorithms, and programming. After covering basic concepts, such as parallel programming platforms and models, we shall study parallel algorithm design and the Message Passing Interface (MPI). Our focus will be on designing and implementing efficient distributed parallel programs. We shall also consider tuning and debugging MPI programs.
2 Objectives
At the end of this course, students should have a thorough understanding of the concepts behind distributed systems and programming. Students should understand the bottlenecks in distributed parallel computing and should know how to design and implement efficient distributed algorithms. In particular, students should be able to write non-trivial applications using MPI on distributed machines.
3 Room/time
Lectures: Tuesday, Thursday, Friday 8:30-9:20, MDCL/1309
Tutorials: Tuesday 11:30-12:20, BSB/105; Monday 13:30-14:20, BSB/105
Office hours: Friday 10:30-11:30
4 Teaching assistants
Thomas Gwosdz (email: gwosdzto)
Jamie Turner (email: turnerjr)
5 Text
Peter Pacheco, An Introduction to Parallel Programming, http://store.elsevier.com/An-Introduction-to-Parallel-Programming/Peter-Pacheco/isbn-9780123742605/
6 Lectures
1. Introduction to MPI: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/mpi-basics.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/mpi-basics.zip
2. Collective communications I: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/collective-comms-1.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/coll-comms-1.zip
3. Caching: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/caches.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/cache.c
4. Interconnection networks: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/interconnections.pdf
5. Communication cost: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/communication-cost.pdf
6. Scalability: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/scalability.pdf
7. Collective communications: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/lec2_2.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/code2_2.tar.gz
8. Parallel program design. Tasks, critical path: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/algdesign.pdf
9. Nonblocking communications: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/lec2_1.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/code2_1.tar.gz
10. Understanding communications: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/lec3_0.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/code3_0.tar.gz
11. Examples: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/private/examples.pdf
12. Data decomposition techniques: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/algdesign2.pdf
13. Array distribution schemes: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/distschemes.pdf
14. Communicators: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/lec3_1.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/code3_1.tar.gz
15. Topologies: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/lec3_2.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/code3_2.tar.gz
16. Communicators and topologies: Fox's Algorithm: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/matrixmult.pdf
17. The Traveling Salesman Problem. Parallel distributed tree search: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/tsp/tsp.pdf
18. Distributed shortest paths: http://www.cas.mcmaster.ca/~nedialk/COURSES/4f03/Lectures/shortestpath.pdf
19. Advanced point-to-point communications: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/lec5.pdf; Code: http://www.cas.mcmaster.ca/~nedialk/COURSES/mpi/Lectures/code5.tar.gz
20. OpenMP: http://www.elsevierdirect.com/companions/9780123742605/LS/Chapter_5.ppt
7 MPI on CAS machines
We have 10 servers set up for MPI, all running RHEL 6.5. The hostnames are mpihost01 to mpihost10; mpihost01 is the head node. You should ssh to it, and then set up passwordless ssh:
ssh-keygen -t rsa
cp .ssh/id_rsa.pub .ssh/authorized_keys
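Once the key is in place, logging in to any of the other hosts should no longer prompt for a password. A quick sanity check (using one of the hostnames above) is:

```shell
# Should print the remote machine's hostname without a password prompt;
# if you are asked for a password, the key copy above did not take effect.
ssh mpihost02 hostname
```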
The machines and number of cores are
# Intel Xeon E7-L8867 2.13GHz
mpihost01 slots=16
# Intel Xeon E5-2960 2.6GHz
mpihost02 slots=4
mpihost03 slots=4
# AMD Opteron 8218 2.6GHz
mpihost04 slots=4
# AMD Opteron 8356 2.3GHz
mpihost05 slots=8
mpihost06 slots=8
# Intel Xeon X5482 3.2GHz
mpihost07 slots=4
mpihost08 slots=4
mpihost09 slots=4
mpihost10 slots=4
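The machine list above has the format of an MPI hostfile. As a minimal sketch of how it is used (the file name hosts and the program name hello are illustrative assumptions, not given in this outline; mpicc and mpirun are the usual MPI compiler and launcher wrappers):

```shell
# Save the machine list to a hostfile, e.g. "hosts" (abbreviated here):
printf 'mpihost01 slots=16\nmpihost02 slots=4\n' > hosts

# Compile an MPI program and launch 20 processes across the listed hosts;
# the "slots" entries cap how many processes each host receives.
mpicc -O2 hello.c -o hello
mpirun --hostfile hosts -np 20 ./hello
```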
8 Resources
MPI standard: http://www.mcs.anl.gov/research/projects/mpi/
MPI: The Complete Reference: http://www.netlib.org/utk/papers/mpi-book/mpi-book.html
MPI tutorial: https://computing.llnl.gov/tutorials/mpi/
Performance topics: https://computing.llnl.gov/tutorials/mpi_performance/
Performance analysis tools: https://computing.llnl.gov/tutorials/performance_tools/
9 Grading
Assignment 1: 8%
Assignment 2: 12%
Assignment 3: 15%
Midterm: 20%
Project: 45%
10 Tentative Schedule
Assignment 1: 21 Jan - 31 Jan
Assignment 2: 31 Jan - 13 Feb
Midterm: 13 March
Assignment 3: 25 Feb - 11 March
Project: 13 March - 8 April
11 Course Policy
Course-related announcements will be posted on the course web site. You are responsible for checking it regularly.
Assignments are due at the beginning of the lecture. No assignment will be accepted after the end of the lecture on its due date.
Requests for remarking of an assignment or the midterm exam must be made within one week after the marked assignment/midterm is returned. Requests made later than one week will not be accommodated.
The assignments will be marked by the TAs. Any request for remarking must first be directed to the TA who marked your assignment problem(s). If, after talking to your TA, you still believe that you deserve a higher mark, you can contact me.
You are allowed to discuss the problems from the assignments. However, you must submit your own work.
Assignments that are very similar may lose half of their marks. Identical solutions to the same problem will receive zero marks.
12 Academic Dishonesty/Ethics
Students are reminded that they should read and comply with the Statement on Academic Ethics and the Senate Resolutions on Academic Dishonesty, as found in the Senate Policy Statements distributed at registration and available in the Senate Office.
These policies are also available at http://www.mcmaster.ca/policy/ac_ethics.htm.
13 Faculty Notices
The Faculty of Engineering is concerned with ensuring an environment that is free of all discrimination. If there is a problem, individuals are reminded that they should contact the Department Chair, the Sexual Harassment Officer, or the Human Rights Consultant as the problem occurs.