Submit a job

IMPORTANT NOTICE

You must not execute your code/software directly on the login-hpc machine.

You must work on a compute node. To do so, you must first request to use one or several machines via a job scheduler named Slurm.

Your request can be written in a file. You then run the command sbatch name_of_your_file on login-hpc to request an allocation of resources on which your code will be executed.

For short duration executions, please refer to the interactive mode section.

You can find more details in the sections below.

Configure your resource reservation

Reservations on the cluster are made via the Slurm scheduler, which keeps submitted jobs in a queue until resources are allocated.

Important: When submitting a job, it is required to mention the following information:

  • account= the name of the group/project assigned to you by the scientific committee (received by e-mail)
  • partition= the queue to which your job will be sent. By default, if no partition is specified, the job is sent to cpucourt. Make sure to choose the right partition according to the needs of your job (time, resources). See the description of the partitions here.
  • time= a walltime after which your job will be stopped by Slurm (format: time=dd-hh:mm:ss). The walltime of your job must be less than or equal to the maximum walltime defined for the partition to which you submit your job (view the limits here).
  • the resources you wish to reserve (by default, 1 core); a sketch combining several of these options is shown after this list. In particular, you can use:
    • -n or ntasks= for the number of tasks to be run. By default, Slurm allocates one core per task. If your job is sequential (no parallelization), ntasks should be equal to 1. Give a value greater than 1 for MPI jobs.
    • cpus-per-task= to be used with OpenMP jobs. Use this option to specify the desired number of threads (to be associated with ntasks=1). Slurm allocates one core per CPU.
    • ntasks-per-node= the number of tasks to be run on the same node.
    • ntasks-per-socket= the number of tasks to be run on the same processor (socket). As a reminder, the nodes of the cpucourt and cpulong queues have 2 processors of 20 cores each.
    • -N or nodes= for the number of nodes. If not specified, the allocation will be based on the other resource options. If you only specify the number of nodes, with the cpucourt queue (which is not exclusive) this reserves by default 1 core per reserved node. As the cpulong queue is exclusive, if you reserve 1 node you reserve all of its cores, even if you do not use them all.
    • constraint=intel or constraint=amd for the type of CPU you want to use (more details here).
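
As an illustration, here is a hedged sketch combining several of the options above in a job script header (the account name and the values are placeholders, to be adapted to your own needs):

#!/bin/bash
#SBATCH --account=your_project_name    # project/group assigned by the scientific committee
#SBATCH --partition=cpucourt           # short-job queue
#SBATCH --time=0-01:00:00              # walltime of 1 hour (dd-hh:mm:ss)
#SBATCH --ntasks=8                     # 8 tasks, i.e. 8 cores (1 core per task)
#SBATCH --ntasks-per-node=4            # spread the 8 tasks over 2 nodes, 4 tasks each
#SBATCH --constraint=intel             # request Intel CPUs

# placeholder: load your modules and launch your code here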

Warning: if your job has not finished executing when the walltime is reached, it will be stopped automatically. It is therefore strongly recommended that you carefully estimate the execution time of your job and introduce checkpoints in your code.

Submit a job

In a script

To submit your job: sbatch file_containing_your_script

The full list of options when submitting a job can be found here.

Example of a script named “myJob” that runs 5 tasks (here, each task uses 1 core). Resources will be reserved for a maximum of 2 hours and the job will be sent to the short-job queue (cpucourt).

#!/bin/bash

#SBATCH --job-name=myJob
#SBATCH --output=output.txt
#SBATCH --ntasks=5
#SBATCH --time=0-02:00:00
#SBATCH --account=your_project_name
#SBATCH --partition=cpucourt
#SBATCH --mail-user=...@univ-cotedazur.fr
#SBATCH --mail-type=BEGIN,END,FAIL

module purge
module load ...

my commands to run the job ...

Although optional, the mail-user and mail-type options allow you to receive email notifications about the status of your job.

The modules allow you to dynamically modify the environment variables needed to run your code (essentially PATH, LD_LIBRARY_PATH or MANPATH), depending on the module you load. For more information on using the modules, click here. The complete list of modules installed on the cluster is available here.
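
For example, a typical (and purely illustrative) sequence of module commands could be the following; gcc is used here only as a placeholder module name, check the list of modules actually installed on the cluster:

module avail    # list the modules installed on the cluster
module purge    # start from a clean environment
module load gcc # load a module (placeholder name)
module list     # show the modules currently loaded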

OpenMP jobs

With OpenMP, parallelization can only take place between the threads of a single node. It is therefore recommended to set ntasks to 1 so that the code execution is started only once. The desired number of threads is defined with cpus-per-task, which corresponds to the number of cores that will be allocated on this node. Here is an example:

#!/bin/bash

#SBATCH --job-name=myJob
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --time=0-02:00:00
#SBATCH --account=projectname
#SBATCH --partition=cpucourt

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
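
The script above only reserves the resources and sets the number of threads: you then launch your OpenMP program at the end of the script, for example (the binary name below is a placeholder):

./my_openmp_program   # placeholder: your OpenMP executable, here run with 12 threads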

MPI jobs

See this page.

Python jobs

See this page to create your Miniconda environment.

GPU

For jobs using the GPU node, you must add the two options below, gres being the number of GPU cards to reserve per node (between 1 and 4).

#SBATCH --gres=gpu:1
#SBATCH --partition=gpu

Important: in some codes that use the cuda:0 notation, you must replace it with cuda:$CUDA_VISIBLE_DEVICES so that your job runs on the assigned graphics card(s).

If you want to specify the GPU card type (A100 or V100), see this page.

To define the number of usable cores per GPU card, you can add the following option:

#SBATCH --cpus-per-gpu=
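
Putting these options together, a hedged sketch of a single-GPU job script could look like this (account name, walltime, modules and commands are placeholders):

#!/bin/bash
#SBATCH --job-name=gpujob
#SBATCH --account=projectname      # placeholder project name
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1               # reserve 1 GPU card on the node
#SBATCH --cpus-per-gpu=4           # example: 4 usable cores for this GPU card
#SBATCH --time=0-02:00:00

module purge
module load ...                    # modules required by your code

# placeholder: launch your GPU code here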

To use more than 4 GPUs in the same job, add the nodes and tasks-per-node options. Here is a job using 6 GPUs (3 GPUs on each of 2 nodes):

#!/bin/bash
#SBATCH --job-name=gpujob
#SBATCH --time=09:00:00
#SBATCH --ntasks=2
#SBATCH --nodes=2
#SBATCH --tasks-per-node=1
#SBATCH --account=projectname
#SBATCH --partition=gpu
#SBATCH --gres=gpu:3

Interactive mode

The interactive mode is an alternative to the sbatch mode.

This allows you to launch an interactive shell on a compute node so that you can work directly on that node. It is especially well suited to the following situations:

  • you want to launch short duration executions
  • you want to compile a code
  • you want to untar a file
  • your software requires user interaction

This mode is also well suited to server-type executions such as Jupyter.

To launch an interactive bash shell for one hour on the cpucourt partition:

srun -A my_account_Slurm -p cpucourt -t 01:00:00 --pty bash -i

In this example, neither the -N nor the -n option is given, so Slurm reserves only one core.
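
If you need more than one core in your interactive session, you can add the resource options described above. For instance, the following hedged sketch (the account name is a placeholder) reserves 4 cores for a single interactive shell via cpus-per-task:

srun -A my_account_Slurm -p cpucourt --ntasks=1 --cpus-per-task=4 -t 01:00:00 --pty bash -i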

Jupyter

See this page.

Visualization node

See this page.

Exclusive mode

By default, Azzurra uses the Slurm shared mode: you reserve a certain number of cores on one or more nodes, and jobs other than yours can run on the remaining cores of these nodes.

When using the exclusive mode, you reserve all the cores of one or several nodes. No other job will run at the same time as yours on these nodes. However, even if your job does not use all the reserved cores, Slurm considers that your job has consumed the elapsed time of the job multiplied by the total number of reserved cores.

Add this option to use the exclusive mode:

#SBATCH --exclusive (in a batch script)
or
--exclusive (on the command line)
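
As a hedged illustration, requesting one full node in exclusive mode from a batch script could look like this (the other options follow the examples above):

#SBATCH --nodes=1       # one node...
#SBATCH --exclusive     # ...entirely reserved for this job, all cores included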

Dependencies between jobs

If you need to submit a series of jobs that need to execute one after another, you can include dependencies.

In the example below, an initial job is submitted. When it completes, a new job will start. This is an example with 5 dependencies; you can adjust it to your needs. With afterany, each job starts once the previous one has terminated, whether it succeeded or failed (Slurm also provides afterok if the next job should only start after a successful completion). Replace code.sh with the code you want to launch and adjust the Slurm options (account, partition, etc.).

#!/bin/sh
 id=$(sbatch --parsable --account=your_account --partition=cpucourt --job-name=job-initial --ntasks=1 --output=outjob.txt code.sh)
 echo "job 1 has jobid $id"
 for n in {1..5}; do
     id=$(sbatch --parsable --account=your_account --partition=cpucourt --depend=afterany:$id --job-name=iteration-$n --ntasks=1 --output=output-iter-$n.slurmout code.sh);
     echo "job $n has jobid $id"
 done