Page 1: Getting acquainted to PDC

PDC Enabling Science

Getting acquainted to PDC

Nils Smeds

<[email protected]

Page 2: Getting acquainted to PDC

Using PDC resources

• Acquiring information
  • http://www.pdc.kth.se
• Guided Tours
  • http://www.pdc.kth.se/support/
  • AFS
  • Strindberg IBM-SP
• Helpdesk
  • FAQs
  • Contact information
  • [email protected]
  • 08-790 7800

Page 3: Getting acquainted to PDC

http://www.pdc.kth.se/doc

Page 4: Getting acquainted to PDC

http://www.pdc.kth.se/support/sp2-tour.html

Page 5: Getting acquainted to PDC

http://www.pdc.kth.se/compresc/

Page 6: Getting acquainted to PDC

Your environment as a user

• File systems
  • AFS: Home directories
  • GPFS: Parallel file system (IBM SP)
  • HSM: Hierarchical storage management
  • /scratch: scratch file systems
• Modules: handle $PATH, $MANPATH
  • module add sp2 local
  • module show local
• E-mail: when you leave
  • Create $HOME/.forward publicly readable
  • http://www.pdc.kth.se/support/misc-tour.html#EMAIL
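As a rough illustration of what "handles $PATH" means, the sketch below prepends a directory to a colon-separated search path without duplicating it. The function name path_prepend and the directory /opt/example/bin are made up for this example; the real module command does considerably more (for instance $MANPATH and unloading), so this is only a picture of the idea, not of its implementation.

```shell
#!/bin/bash
# path_prepend DIR PATHLIST
# Prints PATHLIST with DIR placed in front, unless DIR is already present.
# (Hypothetical helper; not part of the modules package.)
path_prepend() {
    local dir=$1 list=$2
    case ":$list:" in
        *":$dir:"*) echo "$list" ;;              # already there: unchanged
        *)          echo "$dir${list:+:$list}" ;; # otherwise put it first
    esac
}

# /opt/example/bin is an invented directory used only for illustration.
new_path=$(path_prepend /opt/example/bin "/usr/bin:/bin")
echo "$new_path"
```

Running "module show local" on the real system is the way to see what a given module actually changes.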

Page 7: Getting acquainted to PDC

Kerberos commands

• kauth: proves your identity
> ./kauth -n [email protected] -l 60

• klist: lists your Kerberos tickets
> ./klist
Ticket file: /tmp/tkt58016
Principal: [email protected]

  Issued           Expires          Principal
Mar  9 12:18:59  Mar  9 13:18:59  krbtgt.NADA.KTH.SE
Mar  9 12:19:25  Mar  9 13:18:59  rcmd.r11n07.pdc.kth.se

• kdestroy: removes your ticket file
> ./klist
Ticket file: /tmp/tkt58016
klist: No ticket file (tf_util)

• kpasswd: changes your password

Page 8: Getting acquainted to PDC

Commands that rely on Kerberos

• Getting a shell
> ./rxtelnet -l username strindberg.pdc.kth.se
> ./telnet -l username strindberg.pdc.kth.se

• Transferring files
> ./ftp strindberg.pdc.kth.se
Connected to r11n07-f.pdc.kth.se
220 r11n07.pdc.kth.se FTP server ready.
Name (strindberg.pdc.kth.se:smeds): <RET>
S:232- //PDC//
S:232- //PDC// Welcome to Strindberg, an IBM SP ...
S:232 User smeds logged in.
ftp> kauth
Password for [email protected]: mypassword
S:200 Tickets will be destroyed on exit
ftp> binary
ftp> put filename.dat
ftp> get otherfile.dat
ftp> quit

Page 9: Getting acquainted to PDC

AFS commands

http://www.pdc.kth.se/support/afs-tour.html

• tokens: lists your AFS tokens
smeds> [email protected]'s Password: mypassword
smeds> unlog
smeds> tokens
Tokens held by the Cache Manager:
   --End of list--
smeds> afslog
smeds> tokens
Tokens held by the Cache Manager:
(AFS ID 22557) tokens for [email protected] [Expires Aug 19 03:38]
(AFS ID 22557) tokens for [email protected] [Expires Aug 19 03:38]
   --End of list--
smeds>

• kauth/kdestroy automatically run afslog/unlog

Page 10: Getting acquainted to PDC

More AFS

• fs: directory access management
smeds> fs setacl directoryname username rl
smeds> fs listacl directoryname
smeds> fs setacl directoryname username none
smeds> fs setacl directoryname system:anyuser rl
smeds> fs help
smeds> fs setacl -h

• pts: ACL group management
smeds> pts mem username
smeds> pts creategroup username:bs106
smeds> pts adduser mybuddy mygroup
smeds> pts examine mygroup
smeds> pts adduser -h

• Putting it all together
smeds> fs setacl MyProject smeds:buddies rl
smeds> fs setacl MyProject smeds:REALbuddies rlidwk
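The pts and fs commands on this slide combine into one short sequence: create a group, add members, then grant the group rights on a directory. The sketch below only prints that sequence so it can be reviewed before anything is run. The helper name share_with_group is invented here, and the group and user names are the placeholder ones from the slide.

```shell
#!/bin/bash
# share_with_group DIR OWNER GROUP RIGHTS MEMBER...
# Prints (does not execute) the AFS commands needed to share DIR with the
# group OWNER:GROUP. Hypothetical helper for illustration only.
share_with_group() {
    local dir=$1 owner=$2 group=$3 rights=$4
    shift 4
    local member
    echo "pts creategroup ${owner}:${group}"
    for member in "$@"; do
        echo "pts adduser ${member} ${owner}:${group}"
    done
    echo "fs setacl ${dir} ${owner}:${group} ${rights}"
}

# Read and lookup rights (rl) for the group smeds:buddies on MyProject.
share_with_group MyProject smeds buddies rl mybuddy
```

Afterwards "fs listacl MyProject" shows whether the resulting ACL is the one intended.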

Page 11: Getting acquainted to PDC

HSM usage

• Use tar to pack many files into one file which can be saved in HSM
smeds> module add hsm
smeds> hsmls -l
smeds> tar cvf /scratch/MyAnalysis.tar Results-980812/Run1/
smeds> hsmcopyto /scratch/MyAnalysis.tar Res-980812-1.tar

• The HSM location is kallsup:/hsm/home/u/username/...; see the output of hsmmyhome
• You may use kerberized rcp to move files to and from this location.
• Online help is available:
smeds> hsmls -h
smeds> hsmcopyfrom -h
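Before a tar file is handed to HSM, it can be worth listing it back to check that it contains what was intended. The sketch below does this with throwaway files in a temporary directory. The directory and file names are stand-ins, and the HSM commands themselves are deliberately left out because they exist only on the PDC systems.

```shell
#!/bin/bash
# Pack a results directory into a single tar file and list it back.
# Everything lives in a temporary directory; the names are examples only.
set -e
workdir=$(mktemp -d)
mkdir -p "$workdir/Results/Run1"
echo "sample output" > "$workdir/Results/Run1/result.dat"

archive="$workdir/MyAnalysis.tar"
# -C makes the names stored in the archive relative to $workdir.
tar cf "$archive" -C "$workdir" Results/Run1

# List the archive contents; this is the check to do before archiving to HSM.
listing=$(tar tf "$archive")
echo "$listing"

rm -rf "$workdir"
```

On the SP the same "tar tf" check would sit between the tar and hsmcopyto steps shown on the slide.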

Page 12: Getting acquainted to PDC

Strindberg usage

• http://www.pdc.kth.se/cgi-bin/strindberg-usage.pl

Page 13: Getting acquainted to PDC

Node types

• http://www.pdc.kth.se/compresc/hardware/
• Batch nodes (T)
  • 160 MHz (640 MFlop/s), 256 MB RAM, 2 GB /scratch
• Batch nodes with more memory (W, Z)
  • 160 MHz (640 MFlop/s), 512/1024 MB RAM, 2 GB /scratch
• 4-way SMP nodes (M)
  • 4 × 332 MHz (4 × 664 MFlop/s), 512 MB RAM, 4 GB /scratch
• 8-way SMP nodes (N, H)
  • 8 × 222 MHz (8 × 888 MFlop/s), 4/16 GB RAM
• Serial nodes (G, S)
  • One 135 MHz wide node with 2 GB RAM, some 67 MHz nodes
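The peak figures on this slide are consistent with clock rate times floating-point operations per cycle times number of processors, reading the SMP entries as 4 × 332 MHz and 8 × 222 MHz. The flops-per-cycle values used below (4 for the 160 MHz and 222 MHz processors, 2 for the 332 MHz ones) are inferred from the quoted MFlop/s numbers and are not stated on the slide.

```shell
#!/bin/bash
# peak_mflops CLOCK_MHZ FLOPS_PER_CYCLE CPUS
# Theoretical peak for one node, in MFlop/s.
peak_mflops() {
    echo $(( $1 * $2 * $3 ))
}

peak_mflops 160 4 1   # T/W/Z node: 640
peak_mflops 332 2 4   # M node: 4 x 664 = 2656
peak_mflops 222 4 8   # N/H node: 8 x 888 = 7104
```

These are theoretical peaks; sustained performance of a real code is normally well below them.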

Page 14: Getting acquainted to PDC

• Login node(s)
  • The node of the SP that you are connected to after
./rxtelnet -l name strindberg.pdc.kth.se
./rxtelnet -l name august.pdc.kth.se
./rxtelnet -l name nf01r01.pdc.kth.se

• Interactive nodes
  • Nodes that are shared among several users. Used for e.g. debugging and compiling.
spattach -i -p#
  • These nodes must be used with IP communication:
export MP_EUILIB=ip

• Dedicated (or batch) nodes
  • Nodes used for production codes and/or longer pre/post-jobs
spsubmit -p# -t time -c CAC scriptfile
spattach -p# -t time -c CAC

Page 15: Getting acquainted to PDC

A full interactive example

rxtelnet strindberg.pdc.kth.se

(New window)

klist

kauth

cd workdir

mpcc -g -o myprog myprog.c

spattach -i -p5

(wait)

./myprog

./myprog -procs 3

./myprog -procs 3 -stdoutmode ordered -labelio yes

Page 16: Getting acquainted to PDC

Interacting with the EASY scheduler

smeds> spsubmit -h
spsubmit [-h][-inWvCb][-c cac][-I#][-s#][[-p# -t#][-j#][-M] file[args]]
  -h: help
  -p processors: number of processors. (Example: -p2W)
  -t minutes: number of minutes (Wall-clock)
  -j Job Type: available job types mpi, task, pvm3...
  -c CAC: optional, submit for accounting group cac.
  -I InitialDir: optional, default current working directory.
  -b: optional, hold job until all jobs completed
  -i: optional, use IP instead of UserSpace.
  -v: optional, verbose.
  -C: optional, commit before submit.
  -s Filename: optional, save EASY generated script
  [...]
  program: executable or script.
  args: optional arguments to program.

User smeds can specify: staff free ta.smeds

Page 17: Getting acquainted to PDC

spsubmit examples

• Submitting an MPI program
smeds> spsubmit -p 4T -t 30 -j mpi ./mympiprog

• Saving the generated script for later re-use
smeds> spsubmit -p 4T -t 30 -j mpi -s myscript.esy ./mympiprog

• To have a mix of nodes and start on a Z-node
smeds> spsubmit -p 1Z8T -t 30 -j mpi ./mympiprog "arg1 'arg2 here'"

• Redirecting STDOUT for your program
smeds> spsubmit -p 4T -t 30 -j mpi ./mympiprog "> job.out"
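A node specification such as 1Z8T reads as "one Z node plus eight T nodes", nine processors in all. The sketch below sums the counts in such a specification. The function name count_nodes is invented for this example, and the parsing rule (a number followed by an optional type letter, repeated) is inferred from the examples in these slides rather than from the EASY documentation.

```shell
#!/bin/bash
# count_nodes SPEC
# Sums the counts in a node specification such as 4T, 1Z8T or 5.
# Hypothetical helper; the grammar is inferred from the slide examples.
count_nodes() {
    local spec=$1 total=0
    # Consume one "<count>[<type letter>]" group per iteration.
    while [[ $spec =~ ^([0-9]+)([A-Za-z]?)(.*)$ ]]; do
        total=$(( total + 10#${BASH_REMATCH[1]} ))
        spec=${BASH_REMATCH[3]}
    done
    if [[ -n $spec ]]; then
        echo "cannot parse: $spec" >&2
        return 1
    fi
    echo "$total"
}

count_nodes 4T       # 4
count_nodes 1Z8T     # 9
count_nodes 1G2Z2T   # 5
```

A check like this is handy when estimating how much CPU time a mixed-node request will reserve.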

Page 18: Getting acquainted to PDC

The batch script file

#!/bin/bash
#------ Customizable part ------
# (Use submitting directory as working directory)
cd $SP_INITIALDIR
OUT=MyProgram.out
#------ End customizable part ------
#------ Generic part ------
PROGRAM="MyProgram" ; PROGRAMDIR="$HOME/Public/MyProgramDir"
export MP_HOSTFILE=$SP_HOSTFILE
export MP_PROCS=$SP_PROCS
export MP_EUILIB=us ; export MP_EUIDEVICE=css0
export MP_INFOLEVEL=0
export MP_CSS_INTERRUPT="yes"

export TMPDIR=/scratch

echo "Executing $PROGRAM in directory `pwd` at `date`"
poe ${PROGRAMDIR}/${PROGRAM} > $OUT
echo "Program finished `date`"

Page 19: Getting acquainted to PDC

http://www.pdc.kth.se/info/qwatch/

• A snapshot from different queues at PDC
• Updated at regular intervals
• It is generated from the same information you get using the command spq

smeds> spq -a

smeds> spq -r

smeds> spq -u smeds

Page 20: Getting acquainted to PDC

http://www.pdc.kth.se/sp/sptetris/

Page 21: Getting acquainted to PDC

Scheduling limits

smeds> spq -h
Usage: spq [-h] [-l] [-L] [-r] [-q] … ...
smeds> spq -l
      NICKNAME  SATURATE  CAC    NJOB  Wall Total
- -   weekend   -         r1149  1     16h
- -   weekend   saturate  tf109  3     169h40
- -   night     saturate  gw11   4     48h40
- -   day       -         gw11   1     3h30
 . . .
smeds> spq -L
INTERVAL    NICKNAME          MAXNJOB  MAXWALLTIME
[15h,60h]   weekend   -  -    -        30h
[4h,15h]    night     -  -    -        16h
[1h,4h]     day       -  -    -        16h
 . . .
[0m01s,2h]  Nshort    -  -    4        -

Page 22: Getting acquainted to PDC

The concept of CACs

• Computer cycle "accounts"
smeds> cac members smeds
CAC groups smeds is a member of: ta.smeds staff summer-2000 free
smeds> cac -h
smeds> spjobsummary -c summer-2000
usr    jid     req     npe  treq  tstart  r-cpu  ucpu
smeds  ######  1G2Z2T  5    0h30  yyhhmm  2h30   1h49
mike   ######  4T      4    0h15  yyhhmm  1h     0h56
 . . .
smeds> cac -h
smeds> spjobsummary -u smeds -f 200003 -l
smeds> spjobsummary -h
smeds> spsummary -h
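In the sample spjobsummary output, the r-cpu column equals npe times treq: 5 × 0h30 = 2h30 and 4 × 0h15 = 1h. Assuming that this is how reserved CPU time is computed (the slide does not say so explicitly), the arithmetic can be sketched as below. The helper names to_minutes and reserved_cpu are made up for this example.

```shell
#!/bin/bash
# to_minutes HhMM
# Converts a time such as 0h30 or 1h to minutes.
to_minutes() {
    local h=${1%%h*} m=${1#*h}
    echo $(( 10#$h * 60 + 10#${m:-0} ))
}

# reserved_cpu NPE TREQ
# Prints NPE * TREQ in the HhMM form used by the summary table.
reserved_cpu() {
    local total=$(( $1 * $(to_minutes "$2") ))
    printf '%dh%02d\n' $(( total / 60 )) $(( total % 60 ))
}

reserved_cpu 5 0h30   # 2h30
reserved_cpu 4 0h15   # 1h00 (the table prints this as 1h)
```

The ucpu column is lower than r-cpu in both rows, that is, the jobs used less CPU time than they reserved.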

Page 23: Getting acquainted to PDC

Compilers (IBM SP)

• cc, mpcc
  • IBM C compiler; mpcc adds special flags for compiling MPI parallel programs: include file search path, tags the binary as parallel, etc.
• xlC, mpCC
  • IBM C++ compiler. Not fully ANSI compliant.
• xlf, mpxlf, xlf90, mpxlf90, f90
  • Fortran, Fortran 90/95 compilers
• Reentrant code generation (thread safety)
  • xlc_r, xlf_r, mpxlf90_r …
• OpenMP directives are currently only available in Fortran

Page 24: Getting acquainted to PDC

Code optimization

• -O2 -O3
  • Code restructuring. Code inlining. Level 3 may cause arithmetic reorganization.
• -qhot
  • Higher order transformations of generated code. Uses cache size information. Occasionally slows code down.
• -qipa, -O4, -O5
  • Interprocedural analysis. Mainly code inlining across file boundaries. -O4 => -O3 -qhot -qipa -qtune=arch -qcache=arch
• -qsmp=omp, -qsmp=auto, -qreport=smplist
  • All of the above. Long compile time. Needs thorough checking of results.

Page 25: Getting acquainted to PDC

The lab session

• The object of the exercise is to get familiar with the PDC environment through hands-on experience
• The lab session has three parts
  • Install a Kerberos travel kit
    • The workstation is a Sun Solaris 2.6 workstation
    • Install a travel kit and verify that you can use it to log in to the SP
  • Experiment with file systems and storage media
    • Try AFS, tokens and ACLs
    • Use the HSM data migration system
  • Run a Fortran90 program on the IBM SP2
    • Serial and parallel, interactive and batch
• Play, experiment, think and ask!

Page 26: Getting acquainted to PDC

Topics that cannot be covered in this talk

• Compiler options
  • Optimization options, linker options, file name convention options
• Programming tools
  • Tracing, Sampling, Debugging, F90 conversion
  • See http://www.pdc.kth.se/compresc/software
  • Totalview, Foresys, Vampir, Dimemas
• Running parallel programs on other computers
  • Running MPICH in the NADA computer lab rooms

Page 27: Getting acquainted to PDC

Totalview and “How to trick the OS”

• Have the program read from the keyboard as early in the program flow as possible after MPI_Init()
• Start the process and attach to the running poe process
module add totalview
./myprog &                  (Start your program)
totalview -no_stop_all &    (Or start totalview in other window)
  • "Show all unattached processes"
  • Attach to the poe process; the debugger locates all MPI processes
  • Select one of the MPI processes (not the poe process)
  • Set breakpoints later in the program flow if you want
  • In one of the MPI process windows say "Go group <G>"
  • Give the program the input it is waiting for

Page 28: Getting acquainted to PDC

Running MPI programs on SUNs (locally)

• MPICH - 1.2.0
  • Argonne National Laboratory
  • Reference MPI implementation
• module add workshop/5.0
• module add mpich/1.2
• mpicc -o myprog-sun myprog.c
• mpirun -np 4 -machinefile LOCAL ./myprog-sun
• The machinefile is reused up to the number of processes requested by -np
• Further information on MPI at KTH: http://www.nada.kth.se/datorer/unix/

Machinefile LOCAL:
red01.nada.kth.se
red01.nada.kth.se
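The statement that the machinefile is reused up to the number of processes can be pictured as cycling through the host list until every rank has a host. The sketch below assumes a plain round-robin and uses the invented names assign_hosts, hostA, hostB and hostC. The placement rules of the real mpirun may differ in detail, for example in how the local host is counted.

```shell
#!/bin/bash
# assign_hosts NP HOST...
# Prints "rank host" lines, wrapping around the host list as needed.
# Hypothetical illustration of machinefile reuse, not MPICH code.
assign_hosts() {
    local np=$1
    shift
    (( $# > 0 )) || { echo "no hosts given" >&2; return 1; }
    local hosts=("$@") rank
    for (( rank = 0; rank < np; rank++ )); do
        echo "$rank ${hosts[rank % ${#hosts[@]}]}"
    done
}

# Three hosts, four processes: rank 3 wraps around to the first host.
assign_hosts 4 hostA hostB hostC
```

With a machinefile that lists the same host twice, as LOCAL does, every rank simply lands on that one machine.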

Page 29: Getting acquainted to PDC

Running MPI programs on SUNs (remote)

• Running on several hosts
  • You may need to set up AFS tokens on the remote hosts
• kauth -h red02.nada.kth.se -l 30
  kauth -h red03.nada.kth.se -l 30
  kauth -h red04.nada.kth.se -l 30
  kauth (For your local rights)
• mpirun -np 4 -machinefile RED ./myprog-sun
• mpirun -np 6 -nolocal -machinefile RED ./myprog-sun
• The remote processes are started by a Kerberos rsh to the remote host. Modern Kerberos rsh implementations include a call to the command afslog.
• The remote ticket must be there in advance

Machinefile RED:
red02.nada.kth.se
red03.nada.kth.se
red04.nada.kth.se
