Sun Grid Engine (SGE) and SLURM job-scheduler concepts are quite similar. The tables below map common SGE commands, environment variables, and job-specification flags to their SLURM equivalents.
User Commands | SGE | SLURM
---|---|---
Interactive login | qlogin | srun --pty bash OR srun -p [partition] --time=4:0:0 --pty bash (for a quick dev node, just run sdev)
Job submission | qsub [script_file] | sbatch [script_file]
Job deletion | qdel [job_id] | scancel [job_id]
Job status by job | qstat -u \* [-j job_id] | squeue -j [job_id]
Job status by user | qstat [-u user_name] | squeue -u [user_name]
Job hold | qhold [job_id] | scontrol hold [job_id]
Job release | qrls [job_id] | scontrol release [job_id]
Queue list | qconf -sql | scontrol show partitions
List nodes | qhost | sinfo -N OR scontrol show nodes
Cluster status | qhost -q | sinfo
GUI | qmon | sview
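As a quick orientation, here is a minimal sketch of the same submit/monitor/cancel workflow in both schedulers. The script name job.sh, the partition name normal, and the job ID 12345 are placeholders, not values from any particular cluster:

```bash
# SGE workflow
qsub job.sh                 # submit; prints "Your job 12345 ..."
qstat -u $USER              # list my jobs
qdel 12345                  # cancel job 12345

# Equivalent SLURM workflow
sbatch job.sh               # submit; prints "Submitted batch job 12345"
squeue -u $USER             # list my jobs
scancel 12345               # cancel job 12345

# Interactive shell on a compute node (SLURM)
srun -p normal --time=4:0:0 --pty bash
```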
Environment Variables | SGE | SLURM
---|---|---
Job ID | $JOB_ID | $SLURM_JOB_ID
Submit directory | $SGE_O_WORKDIR | $SLURM_SUBMIT_DIR
Submit host | $SGE_O_HOST | $SLURM_SUBMIT_HOST
Node list | $PE_HOSTFILE | $SLURM_JOB_NODELIST
Job array index | $SGE_TASK_ID | $SLURM_ARRAY_TASK_ID
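These variables make it possible to write a job script that runs under either scheduler. The sketch below is an illustration, not part of either scheduler's documentation; the detection logic and variable names JOBID/WORKDIR are our own:

```bash
#!/bin/bash
# Detect which scheduler is running this script by checking its env vars.
if [ -n "$SLURM_JOB_ID" ]; then
    JOBID=$SLURM_JOB_ID
    WORKDIR=$SLURM_SUBMIT_DIR
elif [ -n "$JOB_ID" ]; then
    JOBID=$JOB_ID
    WORKDIR=$SGE_O_WORKDIR
else
    echo "Not running under SGE or SLURM" >&2
    exit 1
fi

cd "$WORKDIR"
echo "Job $JOBID running in $WORKDIR on $(hostname)"
```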
Job Specification | SGE | SLURM
---|---|---
Script directive | #$ | #SBATCH
Queue | -q [queue] | -p [queue]
Count of nodes | N/A | -N [min[-max]]
CPU count | -pe [PE] [count] | -n [count]
Wall clock limit | -l h_rt=[seconds] | -t [min] OR -t [days-hh:mm:ss]
Standard out file | -o [file_name] | -o [file_name]
Standard error file | -e [file_name] | -e [file_name]
Combine STDOUT & STDERR files | -j yes | (use -o without -e)
Copy environment | -V | --export=[ALL\|NONE\|variables]
Event notification | -m abe | --mail-type=[events]
Send notification email | -M [address] | --mail-user=[address]
Job name | -N [name] | --job-name=[name]
Restart job | -r [yes\|no] | --requeue OR --no-requeue (NOTE: configurable default)
Set working directory | -wd [directory] | --workdir=[dir_name]
Resource sharing | -l exclusive | --exclusive OR --share
Memory size | -l mem_free=[memory][K\|M\|G] | --mem=[mem][M\|G\|T] OR --mem-per-cpu=[mem][M\|G\|T]
Charge to an account | -A [account] | --account=[account]
Tasks per node | (fixed allocation_rule in PE) | --ntasks-per-node=[count]
CPUs per task | N/A | --cpus-per-task=[count]
Job dependency | -hold_jid [job_id\|job_name] | --dependency=[state:job_id]
Job project | -P [name] | --wckey=[name]
Job host preference | -q [queue]@[node] OR -q [queue]@@[hostgroup] | --nodelist=[nodes] AND/OR --exclude=[nodes]
Quality of service | N/A | --qos=[name]
Job arrays | -t [array_spec] | --array=[array_spec] (Slurm version 2.6+)
Generic resources | -l [resource]=[value] | --gres=[resource_spec]
Licenses | -l [license]=[count] | --licenses=[license_spec]
Begin time | -a [YYMMDDhhmm] | --begin=YYYY-MM-DD[THH:MM[:SS]]
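Putting several of these directives together, the following is a minimal sketch of a SLURM job-array script; the partition name normal, the account name myproject, and the five-task array range are placeholder assumptions:

```bash
#!/bin/bash
#SBATCH --job-name=demo          # SGE: -N demo
#SBATCH -p normal                # SGE: -q normal      (placeholder partition)
#SBATCH -n 1                     # SGE: -pe [PE] 1
#SBATCH -t 0-01:00:00            # SGE: -l h_rt=3600
#SBATCH --mem=2G                 # SGE: -l mem_free=2G
#SBATCH --account=myproject      # SGE: -A myproject   (placeholder account)
#SBATCH -o demo.%A_%a.out        # one output file per array task
#SBATCH --array=1-5              # SGE: -t 1-5

# $SLURM_ARRAY_TASK_ID plays the role of SGE's $SGE_TASK_ID
echo "Processing chunk $SLURM_ARRAY_TASK_ID"
```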
SGE | SLURM
---|---
qstat | squeue
qstat -u username | squeue -u username
qsub | sbatch
qsub -N jobname | sbatch -J jobname
 | sbatch --mem=4000
# Interactive run, one core | # Interactive run, one core
qrsh -l h_rt=8:00:00 | salloc -t 8:00:00 OR interactive -p core -n 1 -t 8:00:00
qdel [job_id] | scancel [job_id]
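One workflow the quick table does not show is job chaining. Below is a sketch of how SGE's -hold_jid maps onto SLURM's --dependency; the script names and the afterok state are illustrative choices:

```bash
# SGE: run post.sh only after job 12345 finishes
qsub -hold_jid 12345 post.sh

# SLURM: run post.sh only after job 12345 completes successfully
sbatch --dependency=afterok:12345 post.sh

# Capture the first job's ID to build the chain in one go (SLURM)
jobid=$(sbatch --parsable first.sh)
sbatch --dependency=afterok:$jobid post.sh
```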
SGE for a single-core application:

```bash
#!/bin/bash
#
#$ -N test
#$ -j y
#$ -o test.output
#$ -cwd
#$ -M $USER@stanford.edu
#$ -m bea
# Request 5 hours run time
#$ -l h_rt=5:0:0
#$ -P your_project_id_here
#
#$ -l mem=4G

# <call your app here>
```

SLURM for a single-core application:

```bash
#!/bin/bash -l
# NOTE the -l flag!
#
#SBATCH -J test
#SBATCH -o test."%j".out
#SBATCH -e test."%j".err
# Default in slurm
#SBATCH --mail-user=$USER@stanford.edu
#SBATCH --mail-type=ALL
# Request 5 hours run time
#SBATCH -t 5:0:0
#SBATCH --mem=4000
#SBATCH -p normal

# <load modules, call your app here>
```
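Both scripts are submitted as shown below; note one caveat: sbatch does not expand shell variables such as $USER inside #SBATCH directives, so in practice the mail address would be written out literally. A usage sketch, with script file names assumed:

```bash
qsub sge_job.sh        # SGE; output lands in test.output
sbatch slurm_job.sh    # SLURM; output lands in test.<jobid>.out
```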
Comparison of some parallel-environment variables set by SGE and SLURM:
SGE | SLURM
---|---
$JOB_ID | $SLURM_JOB_ID
$NSLOTS | $SLURM_NPROCS
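These slot/task counts are typically fed straight to an MPI launcher. A sketch, assuming an MPI program named ./app; note that $SLURM_NPROCS is a legacy name, and newer Slurm versions also set $SLURM_NTASKS:

```bash
# SGE parallel environment
mpirun -np $NSLOTS ./app

# SLURM equivalent
mpirun -np $SLURM_NPROCS ./app
# or let srun derive the task count from the allocation itself
srun ./app
```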