Start a GPU Slurm session
The “gpudebug” partition is for debugging GPU applications only. A “gpudebug” command is available for starting a Slurm session.
The following command (the email address is optional) will get you a Slurm session with 12 CPU cores, 12 GB of CPU memory and one GPU card (NVIDIA T4):
gpudebug [YOUR-EMAIL-ADDRESS]
The “gpu” Slurm partition can be used with or without allocating a GPU card.
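Since a session on the “gpu” partition may or may not include a GPU card, it can be useful to check from inside the session what was actually allocated. The following is a minimal sketch, not a site-provided tool: the helper name `gpu_status` is hypothetical, and it assumes the cluster's Slurm GPU configuration exports `CUDA_VISIBLE_DEVICES` for GPU allocations, which is common but site-dependent.

```shell
#!/bin/bash
# Report whether the current session appears to have a GPU allocated.
# Assumption: Slurm exports CUDA_VISIBLE_DEVICES when a GPU is allocated.
# gpu_status is a hypothetical helper name used only for illustration.
gpu_status() {
    case "${CUDA_VISIBLE_DEVICES:-}" in
        ""|NoDevFiles)
            echo "No GPU allocated (CUDA_VISIBLE_DEVICES is unset, empty, or NoDevFiles)"
            ;;
        *)
            echo "GPU allocated: CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"
            ;;
    esac
}

gpu_status

# If the NVIDIA driver tools are installed, also list the visible cards.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi could not query a GPU"
fi
```

Run inside a session started with `--gpus=1`, this should report the allocated device index; in a CPU-only session it should report that no GPU is allocated.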
For example, to start a GUI “gpu” Slurm session with one GPU card and 20 CPU cores, and to receive email notifications from Slurm, you can use the following command:
srun -p gpu --gpus=1 -c 20 --mail-type=ALL --mail-user=YOUR-EMAIL-ADDRESS --x11 --pty bash -i
To start a non-GUI “gpu” Slurm session with 20 CPU cores only, for example, you can use the following command:
srun -p gpu -c 20 --pty bash -i
The following sbatch script example asks for 1 GPU, 20 CPU cores and 40 GB of RAM:
#!/bin/bash
#SBATCH --job-name="slurmGpuTest"
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=40000
#SBATCH --gpus=1
#SBATCH --output=slurmGpuTest.out
#SBATCH --mail-user=YOUR-EMAIL-ADDRESS
#SBATCH --mail-type=ALL
##SBATCH --time=00:15:00
##SBATCH --requeue    # Specifies that the job will be requeued after a node failure. The default is that the job will not be requeued.
##SBATCH --checkpoint=1:0:0

hostname
date +'%y-%m-%d %H:%M:%S'
which python
python --version
module load python
which python
python --version
module unload python
which python
python --version
date +'%y-%m-%d %H:%M:%S'
sleep 100
date +'%y-%m-%d %H:%M:%S'
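To run the batch script, it has to be submitted with `sbatch`. The sketch below shows one way this might look, assuming the script above has been saved as `slurmGpuTest.sh` (a hypothetical file name) and that the commands are run on a cluster login node. The helper `parse_jobid` is also hypothetical; it only strips the optional cluster name that `sbatch --parsable` may append to the job ID.

```shell
#!/bin/bash
# Sketch: submit the batch script and show its state in the queue.
# Assumption: the script was saved as slurmGpuTest.sh (hypothetical name).
script="slurmGpuTest.sh"

# sbatch --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
parse_jobid() {
    echo "${1%%;*}"
}

if command -v sbatch >/dev/null 2>&1 && [ -f "$script" ]; then
    jobid=$(parse_jobid "$(sbatch --parsable "$script")")
    echo "Submitted job $jobid"
    squeue -j "$jobid"    # show the job's current state
else
    echo "sbatch or $script not found; run this on a cluster login node"
fi
```

Once the job has run, its output should appear in `slurmGpuTest.out`, as set by the `--output` directive in the script.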