Start a GPU Slurm session

The “gpudebug” partition is for debugging GPU applications only. A “gpudebug” command is available for starting a Slurm session.
The following command (the email address is optional) starts a Slurm session with 12 CPU cores, 12 GB of CPU memory, and one GPU card (Nvidia T4):
The “gpu” Slurm partition can be used with or without allocating a GPU card.
For example, to start a GUI “gpu” Slurm session with one GPU card and 20 CPU cores, with email notifications from Slurm, use the following command:
srun -p gpu --gpus=1 -c 20 --mail-type=ALL --mail-user=YOUR-EMAIL-ADDRESS --x11 --pty bash -i
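Since the email options are optional, the command above can be assembled by a small helper that adds them only when an address is given. This is a minimal sketch, not a site-provided tool: the function name gpu_session_cmd is hypothetical, and it only prints the srun command instead of running it, so it can be tried outside the cluster.

```shell
# Hypothetical helper: print the srun command for an interactive GUI "gpu" session.
# Usage: gpu_session_cmd CORES [EMAIL]
gpu_session_cmd() {
    local cores="$1"
    local email="${2:-}"
    local cmd="srun -p gpu --gpus=1 -c ${cores}"
    # Add the mail options only when an address was supplied.
    if [ -n "$email" ]; then
        cmd="${cmd} --mail-type=ALL --mail-user=${email}"
    fi
    # --x11 turns on X11 forwarding for GUI programs.
    echo "${cmd} --x11 --pty bash -i"
}

gpu_session_cmd 20 YOUR-EMAIL-ADDRESS
```

Called with 20 and an address, the helper prints the same command as shown above; called with 20 alone, it prints the command without the two mail options.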
For example, to start a non-GUI “gpu” Slurm session with 20 CPU cores and no GPU card, use the following command:
srun -p gpu -c 20 --pty bash -i
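Once the session starts, it can be useful to confirm what was actually allocated. The sketch below assumes the standard Slurm environment variables (SLURM_JOB_ID, SLURM_JOB_PARTITION, SLURM_CPUS_PER_TASK) and assumes the cluster exports CUDA_VISIBLE_DEVICES when a GPU card is allocated, which is common but depends on the site configuration. The function name show_allocation is hypothetical.

```shell
# Print a short summary of the current Slurm allocation.
# Variables that are not set are reported as "unset" (or "none" for GPUs).
show_allocation() {
    echo "job id:    ${SLURM_JOB_ID:-unset}"
    echo "partition: ${SLURM_JOB_PARTITION:-unset}"
    echo "cpus:      ${SLURM_CPUS_PER_TASK:-unset}"
    echo "gpus:      ${CUDA_VISIBLE_DEVICES:-none}"
}

show_allocation

# nvidia-smi -L lists the visible GPU cards; skip it where it is not installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi could not query a GPU"
fi
```

In a session started without a GPU card, the "gpus" line would be expected to show "none".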
The following sbatch script example requests 1 GPU, 20 CPU cores, and 40 GB of RAM (--mem=40000; the value is in megabytes by default):
#!/bin/bash
#SBATCH --job-name="slurmGpuTest"
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=40000
#SBATCH --gpus=1
#SBATCH --output=slurmGpuTest.out
#SBATCH --mail-type=ALL

# Optional directives, disabled by the extra "#". Remove one "#" to enable a line.
##SBATCH --time=00:15:00
##SBATCH --requeue      # Specifies that the job will be requeued after a node failure. The default is that the job will not be requeued.
##SBATCH --checkpoint=1:0:0

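# Record the start time.
date +'%y-%m-%d %H:%M:%S'
# Show which Python is on the PATH before loading the module.
which python
python --version
# Load the python module and check again.
module load python
which python
python --version
# Unload the module and check once more.
module unload python
which python
python --version
date +'%y-%m-%d %H:%M:%S'
# Wait 100 seconds, then record the end time.
sleep 100
date +'%y-%m-%d %H:%M:%S'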
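
A script like the one above is submitted with sbatch, which replies with a line of the form "Submitted batch job N". The sketch below assumes the script was saved as slurmGpuTest.sh (a hypothetical file name) and uses a hypothetical helper, job_id_from_sbatch, to pull the job ID out of that reply so it can be passed to squeue or scancel. Outside the cluster it only demonstrates the parsing on a sample message.

```shell
# Extract the numeric job ID from sbatch's "Submitted batch job N" message.
job_id_from_sbatch() {
    echo "$1" | awk '/^Submitted batch job/ {print $4}'
}

# Submit only where sbatch exists and the (hypothetical) script file is present.
if command -v sbatch >/dev/null 2>&1 && [ -f slurmGpuTest.sh ]; then
    msg="$(sbatch slurmGpuTest.sh)"
    jobid="$(job_id_from_sbatch "$msg")"
    echo "submitted job ${jobid}"
    # Show the job's state in the queue; it may already have left the queue.
    squeue -j "$jobid" || true
    # scancel "$jobid"    # uncomment to cancel the job
else
    # No Slurm here: run the parser on a sample message instead.
    job_id_from_sbatch "Submitted batch job 12345"
fi
```

With the --output directive used in the script, the job's output would be written to slurmGpuTest.out in the directory the job was submitted from.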