User Guide of High Performance Computing Cluster in School of Physics
This document aims to help users quickly log into the cluster, set up the software
environment and get their jobs running. It does not cover the cluster's hardware or operating system
(e.g. Linux shell, file systems, etc.), nor does it cover the usage of software development tools or
parallel programming. If required, those topics may be included in the future.
You can use secure shell (ssh) to connect to the cluster, using either its short name or its full name,
headnode.physics.usyd.edu.au.
Software environments
You often need to set up the software environment for a program you wish to use on a computer
system, for example by adding PBS to your search PATH. This can be specified in your login shell
script file, .cshrc or .bashrc. If you have already done so, you can keep using it. Otherwise, you are
encouraged to use the Environment Modules package for this purpose.
The package provides an easy way to customize your shell environment, especially on the
fly. To list the software for which an environment can be set up, enter module avail. To set up
the environment for one of them, for example PBS, enter
module load PBS
If you add this line to your login shell script, the configuration for PBS will be done for you each
time you log in. You can then run PBS commands such as qsub, qstat etc., or man qsub to get
information about qsub.
You may need to unload a package before loading another one to avoid conflicts. For example,
if you have already set up openMPI with the Intel compilers and now want to use it with the GNU
compilers, first unload the Intel-based module with module unload, then load the GNU-based one
with module load; module avail shows the exact module names.
The command module help or man module gives more information about how to use the
Environment Modules package.
What is PBS; Basic PBS user commands; Available queues on the cluster; Job submission and
Job Script Template; Tips for specifying resources; Additional job script templates; Monitoring
jobs; Interactive jobs; Array jobs
What is PBS
PBS is a distributed workload management system. As such, PBS handles the management of
computational workload on a set of compute nodes. PBS plays three primary roles: queuing,
scheduling and monitoring jobs. From the user's perspective, PBS allows you to make more
efficient use of your time. You specify the tasks you need executed. The system takes care of
running these tasks and returning the results to you. If the available compute nodes are full, then
PBS holds your work and runs it when the resources are available.
To use PBS, you create a batch job which you then submit to PBS. A batch job is a file (a shell script)
containing a set of commands you want to run on a set of execution machines. It also contains
directives which specify the characteristics (attributes) of the job, and resource requirements (e.g.
number of processors, amount of memory and length of time) that your job needs. Once you
create your PBS job, you can reuse it if you wish. Or, you can modify it for subsequent runs.
Basic PBS user commands
qstat -u uid    display job status for user uid only
All PBS client commands are in headnode:/usr/physics/torque/bin. Use the man pages for detailed usage of
each command.
Available queues on the cluster
physics: queue for all physics users (jobs run on nodes 01-16 and 31-35)
yossarian: queue for Complex Systems users (jobs run on nodes 21-23)
hippocrates: queue for Medical Physics users (jobs run on nodes 01-16 and 31-35)
cmt: queue for Condensed Matter Theory users (jobs run on nodes 41-45)
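Job submission and Job Script Template
To submit a job, pass a batch job file to qsub, for example qsub run-job.csh,
where run-job.csh is a batch job file which contains qsub options and the commands/programs that you
want to run. Here is an example run-job.csh: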
#!/bin/csh
#PBS -N MyJobName
#PBS -o demo.txt
#PBS -j oe
#PBS -q yossarian
#PBS -l nodes=1:ppn=4
#PBS -l walltime=00:01:00
#PBS -m ea
#PBS -M [email protected]
#PBS -V
cd "$PBS_O_WORKDIR"
#your commands/programs start here, for example:
hostname
exit
If you submit this job, it will generate a file demo.txt with the hostname of the node it ran on
printed. The output may contain harmless TTY warnings related to using tcsh rather than bash.
Notes on the above example run-job.csh:
#!/bin/csh
Indicates that the script is run by the C shell.
-N MyJobName
The name of your job.
-o demo.txt
The file to which standard output from your job is written.
-j oe <optional>
Merges stdout and stderr into the output file. Otherwise, PBS automatically creates a
separate error log.
-q yossarian
Selects which PBS queue to use. Use the queue corresponding to your group.
-l nodes=1:ppn=4
Specifies the CPU resources required; here, 4 processors on 1 node.
-l walltime=00:01:00
The maximum wall time requested to run the job; here, 1 minute. Warning: if the job
has not finished when this walltime is reached, your job will be killed.
-m ea
Sends a notification email when the job ends (e) or aborts (a).
-M [email protected]
Your email address.
-V
Declares that all environment variables in the qsub command's environment are to be
exported to the batch job. If this directive is missing, your job may be terminated because,
e.g., $TERM is not set.
cd "$PBS_O_WORKDIR"
Changes to the directory from which you submitted this file (the variable
$PBS_O_WORKDIR contains its path).
Tips for specifying resources
The main cluster resources are compute nodes and processors, memory and execution time.
Multiple users share the resources on the cluster. The general advice is to request resources as
accurately as your job needs. As seen above, you specify the number of nodes and number of
processors with the option -l nodes=*:ppn=*. nodes designates how many nodes your job should
be executed on, and ppn specifies the number of processors that will be allocated on each node.
For example,
-l nodes=1:ppn=1     1 processor on 1 node. This is what you should use for a non-parallel program
-l nodes=2:ppn=1     1 processor per node, for a total of 2 processors
-l nodes=1:ppn=14    14 processors on 1 node. This option will cause the queue to reject your job
                     because no nodes have enough processors
PBS will reserve the number of nodes and processors you have specified for your job no matter
how many processors your job actually runs on. These nodes and processors will not be given
any new tasks while your job is running. On the other hand, if you request -l nodes=1:ppn=1 for
a Matlab job which uses a matlabpool of size 8 (it will run on 8 processors), PBS won't know
that your Matlab program uses 8 processors and may assign other processors on the node to other
jobs. Your job and the other jobs will then compete for the same processors, and this will cause
all the jobs to slow down. Therefore, it is important that you request the correct number of nodes
and processors for your job.
Each node has about 32GB of swap space, which means that when jobs use up all physical
memory, memory swapping will occur to keep jobs running. Memory swapping will also slow
down all jobs running on the node. You can reserve a certain amount of physical memory by specifying
-l mem=??MB or -l mem=??GB (the maximum amount of physical memory used by the job) to
avoid using swap space. For example, -l mem=3GB reserves 3GB of physical memory.
A little trial and error may be required to find how much memory your job is using. Your job
will only run if there is at least as much free memory as requested (more than 3GB in the above
example), so making a sensible memory request will allow your job to run sooner. If your job
needs more memory than you have specified, the job will terminate when it reaches mem.
Users may reserve more memory for their jobs by simply requesting all or more processors on a
node instead of specifying the requested memory size. This works, but it blocks other users' jobs
that have lower memory usage from running.
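As an illustration of the memory request described above, the following sketch writes a small job script that reserves 3GB of physical memory for a single-processor job. The file name mem-job.csh, the job name mem-demo and the program name yourProgram are placeholders chosen for this example, not names from the cluster; the physics queue and the remaining directives follow the templates in this guide.

```shell
# Write an example job script (mem-job.csh is a placeholder file name).
cat > mem-job.csh <<'EOF'
#!/bin/csh
#PBS -N mem-demo
#PBS -q physics
#PBS -l nodes=1:ppn=1
#PBS -l mem=3GB
#PBS -l walltime=00:10:00
#PBS -j oe
#PBS -V
cd "$PBS_O_WORKDIR"
# yourProgram is a placeholder for the program you want to run
./yourProgram
EOF

# On the head node the script would then be submitted with:
#   qsub mem-job.csh
```

If the job turns out to need more than the requested amount, raise the mem value and resubmit rather than requesting extra processors.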
It is recommended that you use -l walltime=* instead of -l cput=* to specify how much time your
program is allowed to run for. walltime literally refers to wall time, the amount of time that a
clock on the wall shows (as opposed to CPU time, the time all processors actually spend on a
task). Once it reaches walltime, your job will be terminated by PBS. It is always best to make
this request as accurate as possible.
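One way to arrive at a walltime value is to time a short run and add a modest margin. The sketch below converts an estimate in seconds into the HH:MM:SS form used by the walltime directive. The helper name walltime_request and the 20% margin are assumptions made for this example, not a cluster policy, and the snippet uses sh/bash syntax rather than csh.

```shell
# Convert a runtime estimate in seconds, plus a safety margin in percent,
# into the HH:MM:SS form used by "-l walltime=".
walltime_request() {
    est_seconds=$1
    margin_percent=$2
    total=$(( est_seconds + est_seconds * margin_percent / 100 ))
    printf '%02d:%02d:%02d\n' $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# A run measured at about 50 minutes, with a 20% margin
walltime_request 3000 20    # prints 01:00:00
```

The printed value can be copied into the job script, for example as #PBS -l walltime=01:00:00.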
Additional job script templates
#!/bin/csh
#PBS -q physics
#PBS -l walltime=1:00:00
#PBS -l nodes=1:ppn=4
#PBS -V
#PBS -N test-matlab
#PBS -m ea
#PBS -M [email protected]
#PBS -j oe
#PBS -o output.txt
cd ${PBS_O_WORKDIR}
# run matlab file yourMatlabScript.m:
matlab -nodisplay -r "yourMatlabScript, exit"
#!/bin/csh
#PBS -q physics
#PBS -l walltime=10:00:00
#PBS -l nodes=4:ppn=2
#PBS -V
#PBS -N test-mpi
#PBS -m ea
#PBS -M [email protected]
#PBS -j oe
#PBS -o output.txt
cd "$PBS_O_WORKDIR"
mpirun -n 8 yourMPIcode # n = nodes x ppn (see resource request)
exit
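In the MPI template above, the value given to mpirun -n has to be kept in step with nodes x ppn by hand. A sketch of an alternative is to count the entries in the node file that PBS provides to the job. It assumes that $PBS_NODEFILE lists one line per allocated processor, which is the usual Torque behaviour but has not been checked on this cluster. The helper name allocated_slots and the stand-in node file are made up for this example so that it can be tried outside PBS, and the snippet uses sh/bash syntax rather than csh.

```shell
# Count the processor slots PBS has allocated to this job.
# Assumes the node file lists one line per allocated processor.
allocated_slots() {
    echo $(( $(wc -l < "$1") ))
}

# Stand-in node file for nodes=4:ppn=2, so the sketch runs outside PBS.
nodefile=$(mktemp)
for node in node01 node02 node03 node04; do
    printf '%s\n%s\n' "$node" "$node" >> "$nodefile"
done

nprocs=$(allocated_slots "$nodefile")
echo "mpirun -n $nprocs yourMPIcode"    # prints: mpirun -n 8 yourMPIcode
rm -f "$nodefile"
```

Inside a real job the helper would be called as allocated_slots "$PBS_NODEFILE", so the process count follows the resource request automatically.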
Monitoring jobs
Use the command qstat -n to view the status of all submitted jobs. Alternatively, you can monitor
the execution of your job by using qload or qtop.
By default, qload shows you a list of all jobs currently in the queue, a summary of which users
are using the system and information on workload over the cluster. For example,
USER LOAD
1- SXY/XUE YANG (25 CORES) 1h 00m 00s remaining
node12 |------------| (12) 0.0 30.5GB node44 |xxxxx-----------| (11) 5.0 124.2GB
node13 |------------| (12) 0.0 30.5GB node45 |xxxxx-----------| (11) 4.9 124.1GB
node21 |-------------------------------| (31) 0.0 186.8GB
node22 |-------------------------------| (31) 0.0 186.8GB
node23 |-------------------------------| (31) 0.0 186.8GB
Your jobs are colored red in the node availability report, so you can see which nodes your job is
running on.
If you want to delete your job before it finishes, use the qdel command and provide your Job ID
from qload. To remove the job Sensor owned by sxy as shown above, user sxy would run qdel followed by that job's ID.
Interactive jobs
You can start an interactive session via PBS by using qsub -I. This will create an interactive
job, and you will be given a shell on a compute node as though you had used ssh. The prompt then shows the name of that node, for example:
node02: ~ >
When using an interactive job, you can specify the number of nodes and CPUs to lock out
(although requesting a number of nodes for an interactive job is only useful if you are going to be
using mpirun). For example, requesting all of the processors on one node with -l nodes=1:ppn=*
starts an interactive job that locks out an entire node. Interactive jobs will also appear in
qload. Please do not use interactive jobs to perform unattended runs (e.g. with batch or screen).
Interactive jobs are ONLY for attended interactive use. By default, interactive jobs will terminate
after 1 hour. You can set the walltime with the -l flag to increase this, just as in the
PBS script file. Please do not start interactive jobs with excessive walltime requests.
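The following sketch shows how the options for an interactive session combine on one command line. The helper name interactive_command and the example values (physics queue, 4 processors, 2 hours) are illustrative choices, not recommended settings for this cluster; -I, -q and -l are the qsub options used elsewhere in this guide. The helper only prints the command, so it can be reviewed before being typed at the head node prompt.

```shell
# Build the qsub command line for an attended interactive session.
# Arguments: queue name, processors on one node, walltime (HH:MM:SS).
interactive_command() {
    echo "qsub -I -q $1 -l nodes=1:ppn=$2 -l walltime=$3"
}

# Print the command for a 2-hour, 4-processor session in the physics queue.
interactive_command physics 4 02:00:00
```

Keeping the walltime argument explicit makes it harder to request more time than the session needs.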
Array jobs
Array jobs are one of the most powerful features of PBS for single-CPU jobs, and are a very
compelling reason for many users to learn and switch to the PBS system. They are useful when
you want to run the same program many times, operating on different input files or with different
input arguments. Array jobs allow you to quickly submit all of the jobs at once, and will run
several instances of your job at the same time. For example, suppose I had a directory with files
data1.csv, data2.csv and data3.csv, and I wanted to run my program myprog FILE on each of
them. I can do this very easily using the -t option:
#!/bin/csh
#PBS -N MyJobName
#PBS -o demo.txt
#PBS -q yossarian
#PBS -l nodes=1:ppn=4
#PBS -l walltime=00:01:00
#PBS -m ea
#PBS -M [email protected]
#PBS -V
#PBS -t 1-3
cd "$PBS_O_WORKDIR"
myprog data${PBS_ARRAYID}.csv
The -t switch instructs PBS to submit this as an array job. You can specify a range of indices (1-3)
or individual indices (1,3,5). For each index, PBS creates a separate job. Submitting this script
will cause 3 jobs to be created, each of them requesting 4 CPUs on 1 node. The variable
$PBS_ARRAYID stores the value of the array index in each submitted job, so each of the 3 jobs
will run with a different value of $PBS_ARRAYID. In this way, myprog will run on each of the
3 data files, even though only one script was submitted to PBS. You can of course do fancier
things with the index, such as using more sophisticated scripting to operate on the array ID before
calling your program.
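As one example of operating on the array ID before calling the program, the sketch below uses the index as a line number in a list of input files, which helps when the file names do not follow a dataN.csv pattern. The list file inputs.txt and the file names in it are placeholders invented for this example, and the snippet uses sh/bash syntax rather than the csh of the templates. Outside PBS the index is not set, so it defaults to 2 here for demonstration only.

```shell
# Outside PBS there is no array index, so default to 2 for this demonstration.
PBS_ARRAYID=${PBS_ARRAYID:-2}

# Stand-in list of inputs, one per line (file names are placeholders).
printf '%s\n' alpha.csv beta.csv gamma.csv > inputs.txt

# Pick the line whose number equals the array index.
input=$(sed -n "${PBS_ARRAYID}p" inputs.txt)
echo "myprog $input"
```

With -t 1-3, the three array jobs would pick lines 1, 2 and 3 of the list respectively.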
Another useful way to use the array ID is as an argument to a Matlab function. For example, if
the command in the PBS script passes $PBS_ARRAYID as the argument to mymatlabscript,
then mymatlabscript.m would be run for each of the different array ID values. You can then
write code in Matlab to decide what each of the array ID values will do.
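A sketch of such a script is shown here, written out to a file so that it can be inspected before submission. It assumes that mymatlabscript.m defines a function taking one numeric argument; the file name matlab-array.csh, the job name and the resource values are placeholders chosen for this example, while the matlab options follow the Matlab template earlier in this guide.

```shell
# Write an example array job script (matlab-array.csh is a placeholder name).
cat > matlab-array.csh <<'EOF'
#!/bin/csh
#PBS -N matlab-array
#PBS -q physics
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00
#PBS -j oe
#PBS -V
#PBS -t 1-3
cd "$PBS_O_WORKDIR"
# Each array job calls mymatlabscript with its own index (1, 2 or 3)
matlab -nodisplay -r "mymatlabscript(${PBS_ARRAYID}), exit"
EOF

# On the head node the script would then be submitted with:
#   qsub matlab-array.csh
```

Inside Matlab, the argument can then select a parameter set or an input file for that particular job.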