A YAML cluster configuration file for a Cloudy Cluster resource manager on an HPC cluster looks like:
    # /etc/ood/config/clusters.d/my_cluster.yml
    ---
    v2:
      metadata:
        title: "My Cluster"
      login:
        host: "my_cluster.my_center.edu"
      job:
        adapter: "ccq"
        image: "my-default-image"
        cloud: "gcp"
        scheduler: "my_scheduler"
        bin: "/path/to/other/CCQ"
        jobid_regex: "different job_id regex: (?<job_id>\\d+) "
        # bin_overrides:
        #   ccqstat: "/usr/local/bin/ccqstat"
        #   ccqdel: ""
        #   ccqsub: ""
with the following configuration options:
adapter
  This is set to ccq.
image
  The default cloud image to use when launching jobs. There is no default.
cloud
  The cloud provider being used. Valid options are gcp or aws. Defaults to gcp.
scheduler
  The name of the scheduler being used. There is no default.
bin
  The path to the CCQ client installation binaries. Defaults to /opt/CloudyCluster/srv/CCQ.
jobid_regex
  The regular expression used to extract the job id from the ccqstat output. Defaults to "job id is: (?<job_id>\\d+) you". You should only need to set this if the ccqstat output changes format. If you do have to change it, your expression must capture the job id in a named group called job_id, as the default does.
bin_overrides
  Replacements/wrappers for CCQ's job submission and control clients. Optional.
  Supports the following clients: ccqstat, ccqdel, ccqsub.
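Before deploying a custom jobid_regex, it can help to try it against a captured line of client output. The following is a minimal sketch in Python, not the adapter's own code. The sample output line and the helper name extract_job_id are assumptions made for illustration. Python's re module does not accept the (?<job_id>...) spelling used in the YAML file, so the sketch uses Python's (?P<job_id>...) form for the same named group; keep the original spelling in the configuration itself.

```python
import re

# Python equivalent of the documented default pattern. The YAML file keeps the
# (?<job_id>...) spelling; only this test harness uses (?P<job_id>...).
DEFAULT_JOBID_REGEX = r"job id is: (?P<job_id>\d+) you"


def extract_job_id(output, pattern=DEFAULT_JOBID_REGEX):
    """Return the text captured by the named group `job_id`, or None if no match."""
    match = re.search(pattern, output)
    return match.group("job_id") if match else None


# Hypothetical output line; the real CCQ wording may differ.
sample = "The job has been submitted. The job id is: 4242 you can use this id to check status."

if __name__ == "__main__":
    print(extract_job_id(sample))
```

If the function returns None for real output, the pattern either does not match or is missing the job_id named group, which is the situation the jobid_regex option exists to fix.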
Prompted for input
You may see the following error when you first try to start a job:
The /opt/CloudyCluster/srv/CCQ/ccqsub command was prompted. You need to generate the certificate manually in a shell by running 'ccqstat' and entering your username/password
This happens because the CCQ libraries need a certificate in order to communicate with the backend servers. To fix it, log in through a shell terminal and generate one by running the ccqstat command and entering your username and password when prompted. If this succeeds, the command writes a ccqCert.cert file to your home directory, which subsequent invocations will use.
Note that these certificates expire, so you may have to regenerate them periodically or specify a very distant expiry date when you generate them.
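To see whether a certificate is already in place before submitting a job, a small check along these lines may be useful. It is only a sketch: the helper name cert_status is hypothetical, and it assumes the file is named ccqCert.cert and sits directly in the home directory, as described above. It reports the file's modification time as a rough proxy for when the certificate was generated; it does not read the actual expiry date, since the file format is not described here.

```python
import time
from pathlib import Path


def cert_status(home=None, filename="ccqCert.cert"):
    """Report whether the certificate file exists and how many days old it is."""
    cert = Path(home) if home is not None else Path.home()
    cert = cert / filename
    if not cert.is_file():
        return {"path": str(cert), "exists": False, "age_days": None}
    # Modification time is only a proxy for when the certificate was generated.
    age_days = (time.time() - cert.stat().st_mtime) / 86400
    return {"path": str(cert), "exists": True, "age_days": age_days}


if __name__ == "__main__":
    status = cert_status()
    if status["exists"]:
        print(f"{status['path']} found, about {status['age_days']:.1f} days old")
    else:
        print(f"{status['path']} not found; run ccqstat in a shell to generate it")
```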