Warning: Open OnDemand's Slurm support defaults to issuing CLI commands with the --export flag set to NONE, while Slurm's own default is ALL. This can cause issues with jobs that rely on environment variables exported from the submitting shell. The current workaround is to export the required variables in a script_wrapper before any job scripts run (see the sketch under copy_environment below). Alternatively, you can enable copy_environment, with the caveat that the PUN's environment is very different from regular shell sessions.

A YAML cluster configuration file for a Slurm resource manager on an HPC cluster looks like:
# /etc/ood/config/clusters.d/my_cluster.yml
---
v2:
  metadata:
    title: "My Cluster"
  login:
    host: "my_cluster.my_center.edu"
  job:
    adapter: "slurm"
    cluster: "my_cluster"
    bin: "/path/to/slurm/bin"
    conf: "/path/to/slurm.conf"
    # bin_overrides:
    #   sbatch: "/usr/local/bin/sbatch"
    #   squeue: ""
    #   scontrol: ""
    #   scancel: ""
    copy_environment: false
with the following configuration options:
adapter
    This is set to slurm.
cluster
    The Slurm cluster name. Optional, passed to Slurm as the -M/--clusters option. Note that using the cluster option is discouraged, because maintenance outages on the Slurm database will propagate to Open OnDemand. Instead, sites should use different conf files for each cluster to limit maintenance outages.
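For example, a site with two clusters might keep one cluster file per cluster, each pointing at its own Slurm configuration; the file names and conf paths here are illustrative:

# /etc/ood/config/clusters.d/cluster_one.yml
---
v2:
  metadata:
    title: "Cluster One"
  job:
    adapter: "slurm"
    bin: "/path/to/slurm/bin"
    # per-cluster slurm.conf instead of a cluster: entry
    conf: "/etc/slurm/cluster_one.conf"

# /etc/ood/config/clusters.d/cluster_two.yml would look the same apart from
# its title and conf: "/etc/slurm/cluster_two.conf".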
bin
    The path to the Slurm client installation binaries.
conf
    The path to the Slurm configuration file for this cluster. Optional.
submit_host
    A different host to SSH to before issuing commands. Optional.
bin_overrides
    Replacements/wrappers for Slurm's job submission and control clients. Optional. Supports the following clients: sbatch, squeue, scontrol, and scancel.
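As a sketch, a site could route job submission through a local wrapper script while leaving the other clients untouched; the wrapper path below is hypothetical:

v2:
  job:
    adapter: "slurm"
    bin: "/path/to/slurm/bin"
    bin_overrides:
      # hypothetical wrapper that injects site defaults, then delegates
      # to the real sbatch
      sbatch: "/opt/site/bin/sbatch_wrapper"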
copy_environment
    Copies the environment of the PUN when issuing CLI commands. Open OnDemand's default behaviour is to use the --export=NONE flag; setting this to true will cause Open OnDemand to issue CLI commands with --export=ALL instead. This may cause issues, as the PUN's environment is very different from a regular shell session.
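If copy_environment is not an option, the script_wrapper mentioned in the warning above can rebuild the job environment instead. Here is a minimal sketch for the basic batch-connect template, assuming that sourcing /etc/profile provides the variables your jobs need; the %s placeholder is where Open OnDemand substitutes the generated job script:

# /etc/ood/config/clusters.d/my_cluster.yml (excerpt)
v2:
  batch_connect:
    basic:
      script_wrapper: |
        # assumption: /etc/profile exports what the jobs below require
        source /etc/profile
        %s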
If you do not have a multi-cluster Slurm setup, you can remove the cluster: "my_cluster" line from the above configuration file.