All HPC Wales compute systems run a single software stack in which the job scheduler is SchedMD’s Simple Linux Utility for Resource Management (Slurm).
Slurm is a scalable, resilient, feature-rich, customisable, open-source package that is used on many of the world’s most powerful supercomputers. Using Slurm is similar to using LSF and other job schedulers: the user writes a job (batch) script and submits it to Slurm, which then schedules the job to run on the specified partition (called a queue in LSF). Each partition maps to specific hardware. For HPC Wales, Slurm provides node monitoring, memory allocation, and job control and accounting capabilities that benefit both users and system administrators.
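As an illustration of the job script described above, here is a minimal sketch of a Slurm batch script. The job name, partition name, time limit and output file name are hypothetical placeholders, not HPC Wales settings; check the partitions available on your system before using it. The #SBATCH lines are standard Slurm directives, which the shell treats as ordinary comments.

```shell
#!/bin/bash
#SBATCH --job-name=hello_slurm        # hypothetical job name
#SBATCH --partition=compute           # hypothetical partition name (assumption)
#SBATCH --ntasks=1                    # request a single task
#SBATCH --time=00:05:00               # wall-clock limit of five minutes
#SBATCH --output=hello_slurm.%j.out   # %j is replaced by the job ID

# Slurm sets these variables inside a running job. The fallbacks let the
# script also be dry-run directly with bash, outside the scheduler.
job_id="${SLURM_JOB_ID:-not-submitted}"
partition="${SLURM_JOB_PARTITION:-none}"

# Print a one-line summary of where the job is running.
describe_job() {
    echo "job ${job_id} on partition ${partition}"
}

describe_job
```

Saved as, say, hello_slurm.sh (a hypothetical file name), the script would be submitted with "sbatch hello_slurm.sh", monitored with "squeue -u $USER" and cancelled with "scancel" followed by the job ID.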
For long-term users of HPC Wales, the migration from the old LSF-based stack requires some rework of job scripts. In the large majority of cases this rework is very simple; in a few cases it will be more complex. Software remains the same and is managed using the same module commands. We provide here a range of resources aimed at easing the migration, and we will be on hand to assist anybody who needs support in the process.
- Slurm: Submitting, Monitoring and Killing Jobs – general use of the Slurm environment
- More On Slurm Jobs – including Slurm specific and LSF-to-Slurm migration information
- LSF to Slurm: Quick Reference – a very quick cheat sheet
- Slurm Package Documentation