Slurm
A YAML cluster configuration file for a Slurm resource manager on an HPC cluster looks like:
Warning
Open OnDemand’s Slurm support defaults to issuing CLI commands with the --export flag set to NONE, whereas Slurm’s own default is ALL. This can cause issues with jobs that require srun. The current workaround is to export SLURM_EXPORT_ENV=ALL in a script_wrapper before any job scripts run. Alternatively, you can use copy_environment below, with the caveat that the PUN’s environment is very different from a regular shell session.
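The script_wrapper workaround mentioned in this warning might look like the sketch below. It assumes the wrapper is set under batch_connect in the same cluster file, and that %s marks where Open OnDemand substitutes the generated job script. Apply it only to the templates your site actually uses (basic and vnc are shown here as an assumption).

```yaml
# Sketch only: fragment of /etc/ood/config/clusters.d/my_cluster.yml
v2:
  batch_connect:
    basic:
      script_wrapper: |
        # Restore Slurm's default export behaviour for srun inside the job
        export SLURM_EXPORT_ENV=ALL
        %s
    vnc:
      script_wrapper: |
        export SLURM_EXPORT_ENV=ALL
        %s
```

The complete cluster configuration for this example follows.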
# /etc/ood/config/clusters.d/my_cluster.yml
---
v2:
  metadata:
    title: "My Cluster"
  login:
    host: "my_cluster.my_center.edu"
  job:
    adapter: "slurm"
    cluster: "my_cluster"
    bin: "/path/to/slurm/bin"
    conf: "/path/to/slurm.conf"
    # bin_overrides:
      # sbatch: "/usr/local/bin/sbatch"
      # squeue: ""
      # scontrol: ""
      # scancel: ""
    copy_environment: false
with the following configuration options:
- adapter
This is set to slurm.
- cluster
The Slurm cluster name. Optional. Passed to Slurm as -M <cluster>.
Warning
Using the cluster option is discouraged, because maintenance outages on the Slurm database will propagate to Open OnDemand. Instead, sites should use a different conf file for each cluster to limit the impact of maintenance outages.
- bin
The path to the Slurm client installation binaries.
- conf
The path to the Slurm configuration file for this cluster. Optional.
- submit_host
A different host to ssh to before issuing commands. Optional.
- bin_overrides
Replacements/wrappers for Slurm’s job submission and control clients. Optional.
Supports the following clients:
sbatch
squeue
scontrol
scancel
- copy_environment
Copies the environment of the PUN when issuing CLI commands. By default, Open OnDemand uses the --export=NONE flag. Setting this to true causes Open OnDemand to issue CLI commands with --export=ALL instead. This may cause issues, as the PUN’s environment is very different from a regular shell session.
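As an illustration of the optional settings above, the following sketch enables submit_host and overrides a single client. The host name submit.my_center.edu is hypothetical, and the sbatch wrapper path is the one from the commented example. It also assumes that clients not listed under bin_overrides continue to be resolved from bin.

```yaml
# /etc/ood/config/clusters.d/my_cluster.yml
---
v2:
  metadata:
    title: "My Cluster"
  login:
    host: "my_cluster.my_center.edu"
  job:
    adapter: "slurm"
    bin: "/path/to/slurm/bin"
    conf: "/path/to/slurm.conf"
    # Hypothetical host: Open OnDemand connects here over ssh, then runs the Slurm commands
    submit_host: "submit.my_center.edu"
    bin_overrides:
      # Only sbatch is replaced by a site wrapper in this sketch
      sbatch: "/usr/local/bin/sbatch"
    # Keep the default --export=NONE behaviour
    copy_environment: false
```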
Note
If you do not have a multi-cluster Slurm setup, you can remove the cluster: "my_cluster" line from the above configuration file.
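For sites that do run more than one Slurm cluster, the approach recommended in the cluster warning (a separate conf file per cluster, with no cluster line) could be sketched as two cluster files. The file names, titles, hosts, and conf paths below are hypothetical placeholders. The two files are shown in one block as separate YAML documents only for brevity.

```yaml
# /etc/ood/config/clusters.d/cluster_a.yml (hypothetical)
---
v2:
  metadata:
    title: "Cluster A"
  login:
    host: "cluster_a.my_center.edu"
  job:
    adapter: "slurm"
    bin: "/path/to/slurm/bin"
    # Each cluster points at its own Slurm configuration file
    conf: "/path/to/cluster_a/slurm.conf"

# /etc/ood/config/clusters.d/cluster_b.yml (hypothetical)
---
v2:
  metadata:
    title: "Cluster B"
  login:
    host: "cluster_b.my_center.edu"
  job:
    adapter: "slurm"
    bin: "/path/to/slurm/bin"
    conf: "/path/to/cluster_b/slurm.conf"
```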
Tip
When installing Slurm, ensure that all nodes on your cluster, including the node running the Open OnDemand server, have the same MUNGE key installed. Read the Slurm Quick Start Administrator Guide for more information on installing and configuring Slurm itself.