Class: OodCore::Job::Adapters::HTCondor::Batch Private
- Inherits: Object
- Defined in: lib/ood_core/job/adapters/htcondor.rb
Overview
This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.
Object used for simplified communication with an HTCondor batch server
Defined Under Namespace
Classes: Error
Instance Attribute Summary
-
#additional_attributes ⇒ Hash{#to_s => #to_s}
readonly
private
Additional attributes to be added to the job submission.
-
#bin ⇒ Pathname
readonly
private
The path to the HTCondor client installation binaries.
-
#bin_overrides ⇒ Hash
readonly
private
Paths to HTCondor client binaries that override the default binaries.
-
#cluster ⇒ String
readonly
private
The cluster name for this HTCondor instance.
-
#default_docker_image ⇒ String
readonly
private
Default docker image for jobs submitted to HTCondor.
-
#default_universe ⇒ String
readonly
private
Default universe for jobs submitted to HTCondor.
-
#strict_host_checking ⇒ Bool
readonly
private
Whether to use strict host checking when connecting to submit_host over ssh.
-
#submit_host ⇒ String
readonly
private
The login node where the job is submitted via ssh.
-
#user_group_map ⇒ String?
readonly
private
A path to the user/group map for HTCondor jobs. The file should follow the format used by [AssignAccountingGroup](htcondor.readthedocs.io/en/latest/admin-manual/introduction-to-configuration.html#FEATURE:ASSIGNACCOUNTINGGROUP).
-
#version ⇒ Gem::Version
readonly
private
The version of HTCondor on the submit_host.
Instance Method Summary
- #condor_q_attrs ⇒ Object private
-
#get_accounts ⇒ Hash{String => Array<String>}
private
Retrieve accounts using user_group_map on @submit_host.
-
#get_jobs(id: "", owner: nil) ⇒ Array<Hash>
private
Retrieve job information using `condor_q`.
-
#get_slots ⇒ Array<Hash>
private
Retrieve slot information using `condor_status`.
-
#hold_job(id) ⇒ Object
private
Place a job on hold using `condor_hold`.
-
#initialize(bin: nil, bin_overrides: {}, submit_host: "", strict_host_checking: false, default_universe: "vanilla", default_docker_image: "ubuntu:latest", user_group_map: nil, cluster: "", additional_attributes: {}) ⇒ Batch
constructor
private
A new instance of Batch.
-
#release_job(id) ⇒ Object
private
Release a job from hold using `condor_release`.
-
#remove_job(id) ⇒ Object
private
Run the `condor_rm` command to remove a job.
-
#submit_string(args: [], script_args: [], env: {}, script: "") ⇒ String
private
Submit a script to the batch server.
Constructor Details
#initialize(bin: nil, bin_overrides: {}, submit_host: "", strict_host_checking: false, default_universe: "vanilla", default_docker_image: "ubuntu:latest", user_group_map: nil, cluster: "", additional_attributes: {}) ⇒ Batch
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Returns a new instance of Batch.
# File 'lib/ood_core/job/adapters/htcondor.rb', line 100
def initialize(bin: nil, bin_overrides: {}, submit_host: "", strict_host_checking: false, default_universe: "vanilla", default_docker_image: "ubuntu:latest", user_group_map: nil, cluster: "", additional_attributes: {})
  @bin = Pathname.new(bin.to_s)
  @bin_overrides = bin_overrides
  @submit_host = submit_host.to_s
  @strict_host_checking = strict_host_checking
  @default_universe = default_universe.to_s
  @default_docker_image = default_docker_image.to_s
  @user_group_map = user_group_map.to_s unless user_group_map.nil?
  @cluster = cluster.to_s
  @additional_attributes = additional_attributes
  @version = get_htcondor_version
end
Instance Attribute Details
#additional_attributes ⇒ Hash{#to_s => #to_s} (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Additional attributes to be added to the job submission
# File 'lib/ood_core/job/adapters/htcondor.rb', line 87
def additional_attributes
  @additional_attributes
end
#bin ⇒ Pathname (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
The path to the HTCondor client installation binaries
# File 'lib/ood_core/job/adapters/htcondor.rb', line 53
def bin
  @bin
end
#bin_overrides ⇒ Hash (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Paths to HTCondor client binaries that override the default binaries
# File 'lib/ood_core/job/adapters/htcondor.rb', line 58
def bin_overrides
  @bin_overrides
end
#cluster ⇒ String (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
The cluster name for this HTCondor instance
# File 'lib/ood_core/job/adapters/htcondor.rb', line 83
def cluster
  @cluster
end
#default_docker_image ⇒ String (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Default docker image for jobs submitted to HTCondor
# File 'lib/ood_core/job/adapters/htcondor.rb', line 74
def default_docker_image
  @default_docker_image
end
#default_universe ⇒ String (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Default universe for jobs submitted to HTCondor
# File 'lib/ood_core/job/adapters/htcondor.rb', line 70
def default_universe
  @default_universe
end
#strict_host_checking ⇒ Bool (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Whether to use strict host checking when connecting to submit_host over ssh
# File 'lib/ood_core/job/adapters/htcondor.rb', line 66
def strict_host_checking
  @strict_host_checking
end
#submit_host ⇒ String (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
The login node where the job is submitted via ssh
# File 'lib/ood_core/job/adapters/htcondor.rb', line 62
def submit_host
  @submit_host
end
#user_group_map ⇒ String? (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
A path to the user/group map for HTCondor jobs. The file should follow the format used by [AssignAccountingGroup](htcondor.readthedocs.io/en/latest/admin-manual/introduction-to-configuration.html#FEATURE:ASSIGNACCOUNTINGGROUP)
# File 'lib/ood_core/job/adapters/htcondor.rb', line 79
def user_group_map
  @user_group_map
end
#version ⇒ Gem::Version (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
The version of HTCondor on the submit_host
# File 'lib/ood_core/job/adapters/htcondor.rb', line 91
def version
  @version
end
Instance Method Details
#condor_q_attrs ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
# File 'lib/ood_core/job/adapters/htcondor.rb', line 164
def condor_q_attrs
  {
    id: "ClusterId",
    sub_id: "ProcId",
    status: "JobStatus",
    owner: "Owner",
    acct_group: "AcctGroup",
    job_name: "JobBatchName",
    procs: "CpusProvisioned",
    gpus: "GpusProvisioned",
    submission_time: "QDate",
    dispatch_time: "JobCurrentStartDate",
    sys_cpu_time: "RemoteSysCpu",
    user_cpu_time: "RemoteUserCpu",
    wallclock_time: "RemoteWallClockTime"
  }
end
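The hash above maps adapter-facing keys to HTCondor ClassAd attribute names, and `get_jobs` passes its values to `condor_q -af` in order. The following is a minimal sketch of how one line of such output could be paired back with the keys, assuming the values come back whitespace-separated in the same order. `parse_line`, `ATTRS`, and the sample line are hypothetical and are not the adapter's own `parse_condor_q_output`, which is not shown on this page.

```ruby
# Subset of the attribute map shown above (keys => ClassAd attribute names).
ATTRS = {
  id: "ClusterId",
  sub_id: "ProcId",
  status: "JobStatus",
  owner: "Owner"
}.freeze

# Hypothetical helper: pair one whitespace-separated output line with the keys.
# Assumes exactly one value per attribute, in the order the attributes were requested.
def parse_line(line, attrs = ATTRS)
  attrs.keys.zip(line.split).to_h
end

sample = "42 0 2 alice" # assumed sample output, not captured from a real pool
job = parse_line(sample)
puts job[:id]    # the ClusterId column
puts job[:owner] # the Owner column
```

Values stay as strings in this sketch; any numeric or status conversion would be a separate step.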
#get_accounts ⇒ Hash{String => Array<String>}
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Retrieve accounts using user_group_map on @submit_host
# File 'lib/ood_core/job/adapters/htcondor.rb', line 219
def get_accounts
  raise Error, "user_group_map is not defined" if user_group_map.nil? || user_group_map.empty?

  # Retrieve accounts, use local file, if exists. Otherwise use from submit_host
  if File.exist?(user_group_map) && File.readable?(user_group_map)
    output = File.read(user_group_map)
  else
    output = call("cat", user_group_map)
  end
  accounts = {}
  output.each_line do |line|
    next if line.strip.empty? || line.start_with?("#") # Skip empty lines and comments
    _, username, groups = line.strip.split(/\s+/, 3)
    accounts[username] = groups.split(",") if username && groups
  end
  accounts
rescue Error => e
  raise Error, "Failed to retrieve accounts: #{e.message}"
end
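To make the parsing loop in `get_accounts` concrete, here is a small standalone sketch that applies the same logic to an in-memory string. The helper name `parse_user_group_map` and the sample map content are assumptions for illustration; the sample assumes three whitespace-separated fields per line with the username second and a comma-separated group list third, which is what the loop expects.

```ruby
# Hypothetical helper mirroring the parsing loop in get_accounts.
def parse_user_group_map(text)
  accounts = {}
  text.each_line do |line|
    # Skip empty lines and comment lines
    next if line.strip.empty? || line.start_with?("#")
    # First field is ignored, second is the username, third is the group list
    _, username, groups = line.strip.split(/\s+/, 3)
    accounts[username] = groups.split(",") if username && groups
  end
  accounts
end

# Assumed sample content, not taken from a real deployment.
sample = <<~MAP
  # user/group map
  * alice physics,chemistry
  * bob biology

  * carol
MAP

accounts = parse_user_group_map(sample)
puts accounts.keys.join(", ")
```

Lines with no group field, such as the `carol` entry here, are silently dropped because `groups` is `nil`.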
#get_jobs(id: "", owner: nil) ⇒ Array<Hash>
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Retrieve job information using `condor_q`
# File 'lib/ood_core/job/adapters/htcondor.rb', line 187
def get_jobs(id: "", owner: nil)
  args = []
  unless id.to_s.empty?
    if id.to_s.include?(".") # if id is a job array, we need to use the ClusterId and ProcId
      cluster_id, proc_id = id.to_s.split(".")
      args.concat ["-constraint", "\"ClusterId == #{cluster_id} && ProcId == #{proc_id}\""]
    else # if id is a single job, we can just use the ClusterId
      args.concat ["-constraint", "\"ClusterId == #{id}\""]
    end
  end
  args.concat ["-constraint", "\"Owner == #{owner}\""] unless owner.to_s.empty?
  args.concat ["-af", *condor_q_attrs.values]

  output = call("condor_q", *args)
  parse_condor_q_output(output)
end
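The constraint handling in `get_jobs` can be examined in isolation. The sketch below copies only the argument-building part into a hypothetical helper, `job_constraint_args`, so the resulting argument list can be inspected without running `condor_q`. It leaves out the `-af` attributes and the private `call` method.

```ruby
# Hypothetical helper mirroring how get_jobs assembles its -constraint arguments.
def job_constraint_args(id: "", owner: nil)
  args = []
  unless id.to_s.empty?
    if id.to_s.include?(".")
      # "cluster.proc" form: constrain on both ClusterId and ProcId
      cluster_id, proc_id = id.to_s.split(".")
      args.concat ["-constraint", "\"ClusterId == #{cluster_id} && ProcId == #{proc_id}\""]
    else
      # plain cluster id: constrain on ClusterId only
      args.concat ["-constraint", "\"ClusterId == #{id}\""]
    end
  end
  args.concat ["-constraint", "\"Owner == #{owner}\""] unless owner.to_s.empty?
  args
end

puts job_constraint_args(id: "42.3").inspect
puts job_constraint_args(id: 42, owner: "alice").inspect
puts job_constraint_args.inspect
```

With neither `id` nor `owner` given, no constraint is added, so the query is left unfiltered by this step.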
#get_slots ⇒ Array<Hash>
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Retrieve slot information using `condor_status`
# File 'lib/ood_core/job/adapters/htcondor.rb', line 208
def get_slots
  args = ["-af", "Machine", "TotalSlotCPUs", "TotalSlotGPUs", "TotalSlotMemory", "CPUs", "GPUs", "Memory", "NumDynamicSlots"]
  args.concat ["-constraint", "\"DynamicSlot is undefined\""]

  output = call("condor_status", *args)
  parse_condor_status_output(output)
end
#hold_job(id) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Place a job on hold using `condor_hold`
# File 'lib/ood_core/job/adapters/htcondor.rb', line 147
def hold_job(id)
  id = id.to_s
  call("condor_hold", id)
rescue Error => e
  raise Error, "Failed to hold job #{id}: #{e.message}"
end
#release_job(id) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Release a job from hold using `condor_release`
# File 'lib/ood_core/job/adapters/htcondor.rb', line 157
def release_job(id)
  id = id.to_s
  call("condor_release", id)
rescue Error => e
  raise Error, "Failed to release job #{id}: #{e.message}"
end
#remove_job(id) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Run the `condor_rm` command to remove a job
# File 'lib/ood_core/job/adapters/htcondor.rb', line 138
def remove_job(id)
  call("condor_rm", id.to_s)
rescue Error => e
  raise Error, "Failed to remove job #{id}: #{e.message}"
end
#submit_string(args: [], script_args: [], env: {}, script: "") ⇒ String
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Submit a script to the batch server
# File 'lib/ood_core/job/adapters/htcondor.rb', line 119
def submit_string(args: [], script_args: [], env: {}, script: "")
  args = args.map(&:to_s)
  script_args = script_args.map(&:to_s).map { |s| s.to_s.gsub('"', "'") } # cannot do double quotes
  env = env.to_h.each_with_object({}) { |(k, v), h| h[k.to_s] = v.to_s }

  path = "#{Dir.tmpdir}/htcondor_submit_#{SecureRandom.uuid}"
  call("bash", "-c", "cat > #{path}", stdin: script)
  output = call("condor_submit", *args, env: env, stdin: "arguments=#{path.split("/").last} #{script_args.join(" ")}\ntransfer_input_files=#{path}").strip

  match = output.match(/(cluster )?(\d+)/)
  raise Error, "Failed to parse job ID from output: #{output}" unless match
  match[2]
end
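The last three lines of `submit_string` extract the job ID with the pattern `/(cluster )?(\d+)/`. The sketch below isolates that step in a hypothetical helper, `extract_job_id`, and runs it on two assumed sample strings: a terse-style `ClusterId.ProcId - ClusterId.ProcId` line and a verbose-style sentence. Neither string was captured from a real pool, and which form the adapter actually receives depends on the `args` passed in, which this page does not show. `ArgumentError` stands in for the adapter's own `Error` class.

```ruby
JOB_ID_PATTERN = /(cluster )?(\d+)/

# Hypothetical helper mirroring the job ID extraction in submit_string.
def extract_job_id(output)
  match = output.strip.match(JOB_ID_PATTERN)
  raise ArgumentError, "Failed to parse job ID from output: #{output}" unless match
  match[2]
end

terse   = "42.0 - 42.0"                       # assumed terse-style output
verbose = "1 job(s) submitted to cluster 42." # assumed verbose-style output

puts extract_job_id(terse)   # first run of digits in the terse line
puts extract_job_id(verbose) # first run of digits in the verbose line
```

Because a regular expression matches at the leftmost possible position and the `cluster ` group is optional, the pattern returns the first run of digits in the string. For the verbose-style sample that is the job count `1`, not the cluster number, so this extraction appears to rely on output that begins with the cluster ID.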