Class: OodCore::Job::Adapters::Torque::Batch

Inherits:
Object
Defined in:
lib/ood_core/job/adapters/torque/batch.rb

Overview

Object used for simplified communication with a batch server

Defined Under Namespace

Classes: Error

Constructor Details

#initialize(host:, submit_host: "", strict_host_checking: true, lib: "", bin: "", bin_overrides: {}, **_) ⇒ Batch

Returns a new instance of Batch.

Parameters:

  • host (#to_s)

    the batch server host

  • submit_host (#to_s) (defaults to: "")

    the login node

  • strict_host_checking (bool) (defaults to: true)

    whether to use strict host checking when connecting to submit_host over ssh

  • lib (#to_s) (defaults to: "")

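    path to the Torque client installation libraries

  • bin (#to_s) (defaults to: "")

    path to the Torque client installation binaries

  • bin_overrides (Hash) (defaults to: {})

    optional overrides for Torque client executables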



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 53

def initialize(host:, submit_host: "", strict_host_checking: true, lib: "", bin: "", bin_overrides: {}, **_)
  @host                 = host.to_s
  @submit_host          = submit_host.to_s
  @strict_host_checking = strict_host_checking
  @lib                  = Pathname.new(lib.to_s)
  @bin                  = Pathname.new(bin.to_s)
  @bin_overrides        = bin_overrides
end
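
The constructor coerces host and submit_host with #to_s, wraps lib and bin in Pathname, and ignores unknown keywords through the trailing **_. A minimal sketch of that behavior, using a stand-in class (BatchSketch is hypothetical and is not part of ood_core):

```ruby
require "pathname"

# Hypothetical stand-in that mirrors only the constructor shown above;
# it is not the real OodCore class and talks to no batch server.
class BatchSketch
  attr_reader :host, :submit_host, :strict_host_checking, :lib, :bin, :bin_overrides

  def initialize(host:, submit_host: "", strict_host_checking: true, lib: "", bin: "", bin_overrides: {}, **_)
    @host                 = host.to_s
    @submit_host          = submit_host.to_s
    @strict_host_checking = strict_host_checking
    @lib                  = Pathname.new(lib.to_s)
    @bin                  = Pathname.new(bin.to_s)
    @bin_overrides        = bin_overrides
  end
end

conn = BatchSketch.new(
  host: :"oak-batch.osc.edu",          # a Symbol is accepted because of #to_s
  lib:  "/usr/local/Torque/5.0.0/lib",
  bin:  "/usr/local/Torque/5.0.0/bin",
  unknown_option: 1                    # swallowed by the trailing **_
)

puts conn.host       # prints the host as a String
puts conn.lib.class  # prints Pathname
```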

Instance Attribute Details

#bin ⇒ Pathname (readonly)

The path to the Torque client installation binaries

Examples:

For Torque 5.0.0

my_conn.bin.to_s #=> "/usr/local/Torque/5.0.0/bin"

Returns:

  • (Pathname)

    path to Torque binaries



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 36

def bin
  @bin
end

#bin_overrides ⇒ Object (readonly)

Optional overrides for Torque client executables

Examples:

{'qsub' => '/usr/local/bin/qsub'}


# File 'lib/ood_core/job/adapters/torque/batch.rb', line 42

def bin_overrides
  @bin_overrides
end
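
This page does not show how bin and bin_overrides are combined when an executable is looked up. One plausible reading is that an override wins and the bin directory is the fallback. The sketch below illustrates that assumption only; resolve_executable is a hypothetical helper, not a method of this class.

```ruby
require "pathname"

# Hypothetical helper: prefer an explicit override, otherwise fall back to
# the executable under the bin directory. The real lookup is not shown on
# this page, so treat this as an assumption.
def resolve_executable(name, bin:, bin_overrides: {})
  bin_overrides.fetch(name.to_s) { Pathname.new(bin.to_s).join(name.to_s).to_s }
end

bin       = "/usr/local/Torque/5.0.0/bin"
overrides = { "qsub" => "/usr/local/bin/qsub" }

qsub_path  = resolve_executable(:qsub,  bin: bin, bin_overrides: overrides)
qstat_path = resolve_executable(:qstat, bin: bin, bin_overrides: overrides)

puts qsub_path   # the override
puts qstat_path  # the path built from the bin directory
```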

#host ⇒ String (readonly)

The host of the Torque batch server

Examples:

OSC's Oakley batch server

my_conn.host #=> "oak-batch.osc.edu"

Returns:

  • (String)

    the batch server host



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 12

def host
  @host
end

#lib ⇒ Pathname (readonly)

The path to the Torque client installation libraries

Examples:

For Torque 5.0.0

my_conn.lib.to_s #=> "/usr/local/Torque/5.0.0/lib"

Returns:

  • (Pathname)

    path to Torque libraries



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 30

def lib
  @lib
end

#strict_host_checking ⇒ Bool (readonly)

Whether to use strict host checking for ssh

Examples:

my_conn.strict_host_checking #=> true

Returns:

  • (Bool)


# File 'lib/ood_core/job/adapters/torque/batch.rb', line 24

def strict_host_checking
  @strict_host_checking
end

#submit_host ⇒ String (readonly)

The login node where the job is submitted via ssh

Examples:

OSC's Owens login node

my_conn.submit_host #=> "owens.osc.edu"

Returns:

  • (String)

    the login node



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 18

def submit_host
  @submit_host
end

Instance Method Details

#==(other) ⇒ Boolean

The comparison operator

Parameters:

  • other (#to_h)

    batch server to compare against

Returns:

  • (Boolean)

    whether the two batch servers have equal #to_h representations



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 71

def ==(other)
  to_h == other.to_h
end

#connect {|cid| ... } ⇒ Object

Creates a connection to the batch server and calls the block in the context of this connection

Yield Parameters:

  • cid (Fixnum)

    connection id from established batch server connection

Yield Returns:

  • the final value of the block



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 93

def connect(&block)
  FFI.lib = lib.join('libtorque.so')
  cid = FFI.pbs_connect(host)
  FFI.raise_error(cid.abs) if cid < 0  # raise error if negative connection id
  begin
    value = yield cid
  ensure
    FFI.pbs_disconnect(cid)            # always close connection
  end
  FFI.check_for_error                  # check for errors at end
  value
end
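
The begin/ensure pair in #connect guarantees that the connection is closed even when the block raises. A minimal sketch of that pattern, with a recording stand-in in place of the real bindings (FakeFFI and with_connection are hypothetical names):

```ruby
# Hypothetical stand-in for the FFI bindings; it only records calls.
module FakeFFI
  @log = []

  class << self
    attr_reader :log

    def pbs_connect(host)
      @log << [:connect, host]
      1 # pretend connection id
    end

    def pbs_disconnect(cid)
      @log << [:disconnect, cid]
    end
  end
end

# Same shape as #connect above: open, yield, and always close.
def with_connection(host)
  cid = FakeFFI.pbs_connect(host)
  raise "connection failed (#{cid.abs})" if cid < 0
  begin
    yield cid
  ensure
    FakeFFI.pbs_disconnect(cid) # runs even if the block raises
  end
end

result = with_connection("oak-batch.osc.edu") { |cid| "used connection #{cid}" }

begin
  with_connection("oak-batch.osc.edu") { |_cid| raise ArgumentError, "boom" }
rescue ArgumentError
  # the error propagates to the caller, but the connection was still closed
end
```

After both calls the log holds two connect entries and two disconnect entries, one pair per call.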

#delete_job(id) ⇒ void

This method returns an undefined value.

Delete a specified job from the batch server

Examples:

Delete job '10219837.oak-batch.osc.edu' from batch

my_conn.delete_job('10219837.oak-batch.osc.edu')

Parameters:

  • id (#to_s)

    the id of the job



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 322

def delete_job(id)
  connect do |cid|
    FFI.pbs_deljob cid, id.to_s, nil
  end
end

#eql?(other) ⇒ Boolean

Checks whether two batch server objects are completely identical to each other

Parameters:

  • other (Batch)

    batch server to compare against

Returns:

  • (Boolean)

    whether the objects have the same class and compare equal with #==



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 79

def eql?(other)
  self.class == other.class && self == other
end
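
Because #== compares #to_h output while #eql? also checks the class, the two can disagree. A minimal sketch of the difference, using hypothetical stand-in classes that copy the method bodies shown on this page:

```ruby
# Hypothetical stand-ins; only the comparison methods mirror this page.
class ServerSketch
  attr_reader :host

  def initialize(host)
    @host = host.to_s
  end

  def to_h
    { host: host }
  end

  def ==(other)
    to_h == other.to_h
  end

  def eql?(other)
    self.class == other.class && self == other
  end
end

class OtherServerSketch < ServerSketch; end

a = ServerSketch.new("oak-batch.osc.edu")
b = ServerSketch.new("oak-batch.osc.edu")
c = OtherServerSketch.new("oak-batch.osc.edu")

same_value     = (a == b)                                # true: same to_h
ignores_class  = (a == c)                                # true: == ignores the class
matches_hash   = (a == { host: "oak-batch.osc.edu" })    # true: a Hash responds to to_h
strict_compare = a.eql?(c)                               # false: eql? also checks the class
```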

#get_job(id, **kwargs) ⇒ Hash

Get info for a given job on the batch server

Examples:

Status info for OSC Oakley's '10219837.oak-batch.osc.edu' job

my_conn.get_job('10219837.oak-batch.osc.edu')
#=>
#{
#  "10219837.oak-batch.osc.edu" => {
#    :Job_Owner => "bob@oakley02.osc.edu",
#    :Job_Name => "CFD_Solver",
#    ...
#  }
#}

Parameters:

  • id (#to_s)

    the id of requested information

  • filters (Array<Symbol>)

    list of attribs to filter on

Returns:

  • (Hash)

    hash with details of job



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 281

def get_job(id, **kwargs)
  get_jobs(id: id, **kwargs)
end
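
As the example shows, the returned Hash is keyed by job id even when a single job is requested, so callers index into it with the id. A small sketch using sample data shaped like the documented return value (the values are illustrative, and how an unknown id is reported is not covered here):

```ruby
# Sample data shaped like the documented return value of #get_job.
job_id = "10219837.oak-batch.osc.edu"
result = {
  job_id => {
    Job_Owner: "bob@oakley02.osc.edu",
    Job_Name:  "CFD_Solver"
  }
}

# The job's attributes live under its id.
info  = result.fetch(job_id)
owner = info[:Job_Owner].split("@").first

puts "#{info[:Job_Name]} is owned by #{owner}"
```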

#get_jobs(id: '', filters: []) ⇒ Hash

Get a hash of job details, keyed by job id, for the jobs on the batch server

Examples:

Status info for OSC Oakley jobs

my_conn.get_jobs
#=>
#{
#  "10219837.oak-batch.osc.edu" => {
#    :Job_Owner => "bob@oakley02.osc.edu",
#    :Job_Name => "CFD_Solver",
#    ...
#  },
#  "10219838.oak-batch.osc.edu" => {
#    :Job_Owner => "sally@oakley01.osc.edu",
#    :Job_Name => "FEA_Solver",
#    ...
#  },
#  ...
#}

Parameters:

  • id (#to_s) (defaults to: '')

    the id of requested information

  • filters (Array<Symbol>) (defaults to: [])

    list of attribs to filter on

Returns:

  • (Hash)

    hash of details for jobs



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 260

def get_jobs(id: '', filters: [])
  connect do |cid|
    filters = FFI::Attrl.from_list(filters)
    batch_status = FFI.pbs_statjob cid, id.to_s, filters, nil
    batch_status.to_h.tap { FFI.pbs_statfree batch_status }
  end
end
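
The last line of #get_jobs converts the native status structure to a Hash and frees it in the same expression. A minimal sketch of that tap idiom follows; FakeStatus and free! are hypothetical stand-ins for the FFI structure and FFI.pbs_statfree.

```ruby
# Hypothetical stand-in for a native status structure that must be freed.
class FakeStatus
  attr_reader :freed

  def initialize(data)
    @data  = data
    @freed = false
  end

  def to_h
    @data.dup
  end

  def free!
    @freed = true
  end
end

status = FakeStatus.new({ "10219837.oak-batch.osc.edu" => { Job_Name: "CFD_Solver" } })

# tap yields the converted hash to the block, runs the cleanup,
# and then returns that same hash as the value of the expression.
jobs = status.to_h.tap { status.free! }
```

The conversion happens before the cleanup, so the Ruby Hash no longer depends on the freed structure.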

#get_node(id, **kwargs) ⇒ Hash

Get info for a given node on the batch server

Examples:

Status info for OSC Oakley's 'n0001' node

my_conn.get_node('n0001')
#=>
#{
#  "n0001" => {
#    :np => "12",
#    ...
#  }
#}

Parameters:

  • id (#to_s)

    the id of requested information

  • filters (Array<Symbol>)

    list of attribs to filter on

Returns:

  • (Hash)

    status info for the node



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 207

def get_node(id, **kwargs)
  get_nodes(id: id, **kwargs)
end

#get_nodes(id: '', filters: []) ⇒ Hash

Get a hash of node details, keyed by node name, for the nodes on the batch server

Examples:

Status info for OSC Oakley nodes

my_conn.get_nodes
#=>
#{
#  "n0001" => {
#    :np => "12",
#    ...
#  },
#  "n0002" => {
#    :np => "12",
#    ...
#  },
#  ...
#}

Parameters:

  • id (#to_s) (defaults to: '')

    the id of requested information

  • filters (Array<Symbol>) (defaults to: [])

    list of attribs to filter on

Returns:

  • (Hash)

    hash of details for nodes



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 187

def get_nodes(id: '', filters: [])
  connect do |cid|
    filters = FFI::Attrl.from_list(filters)
    batch_status = FFI.pbs_statnode cid, id.to_s, filters, nil
    batch_status.to_h.tap { FFI.pbs_statfree batch_status }
  end
end
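
In the example output the attribute values are strings, so numeric attributes such as :np need converting before any arithmetic. A short sketch using sample data shaped like the documented return value (the data is illustrative, not live output):

```ruby
# Sample data shaped like the documented return value of #get_nodes.
nodes = {
  "n0001" => { np: "12" },
  "n0002" => { np: "12" }
}

# Convert each string value to an Integer before summing.
total_np = nodes.values.sum { |attrs| attrs[:np].to_i }

puts "#{nodes.size} nodes, np total #{total_np}"
```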

#get_queue(id, **kwargs) ⇒ Hash

Get info for a given queue on the batch server

Examples:

Status info for OSC Oakley's parallel queue

my_conn.get_queue("parallel")
#=>
#{
#  "parallel" => {
#    :queue_type => "Execution",
#    ...
#  }
#}

Returns:

  • (Hash)

    status info for the queue



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 164

def get_queue(id, **kwargs)
  get_queues(id: id, **kwargs)
end

#get_queues(id: '', filters: []) ⇒ Hash

Get a hash of queue details, keyed by queue name, for the queues on the batch server

Examples:

Status info for OSC Oakley queues

my_conn.get_queues
#=>
#{
#  "parallel" => {
#    :queue_type => "Execution",
#    ...
#  },
#  "serial" => {
#    :queue_type => "Execution",
#    ...
#  },
#  ...
#}

Parameters:

  • id (#to_s) (defaults to: '')

    the id of requested information

  • filters (Array<Symbol>) (defaults to: [])

    list of attribs to filter on

Returns:

  • (Hash)

    hash of details for the queues



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 144

def get_queues(id: '', filters: [])
  connect do |cid|
    filters = FFI::Attrl.from_list(filters)
    batch_status = FFI.pbs_statque cid, id.to_s, filters, nil
    batch_status.to_h.tap { FFI.pbs_statfree batch_status }
  end
end

#get_status(filters: []) ⇒ Hash

Get a hash with status info for this batch server

Examples:

Status info for OSC Oakley batch server

my_conn.get_status
#=>
#{
#  "oak-batch.osc.edu:15001" => {
#    :server_state => "Idle",
#    ...
#  }
#}

Parameters:

  • filters (Array<Symbol>) (defaults to: [])

    list of attribs to filter on

Returns:

  • (Hash)

    status info for batch server



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 118

def get_status(filters: [])
  connect do |cid|
    filters = FFI::Attrl.from_list filters
    batch_status = FFI.pbs_statserver cid, filters, nil
    batch_status.to_h.tap { FFI.pbs_statfree batch_status }
  end
end
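
The example return value has a single entry whose key combines the server host and a port. A small sketch of reading it, using sample data copied from the example above (whether the key always carries a port is an assumption):

```ruby
# Sample data shaped like the documented return value of #get_status.
status = {
  "oak-batch.osc.edu:15001" => { server_state: "Idle" }
}

# Take the only entry and split its key into host and port.
server_key, attrs = status.first
host, port = server_key.split(":", 2)
state = attrs[:server_state]

puts "#{host} (port #{port}) is #{state}"
```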

#hash ⇒ Fixnum

Generates a hash value for this object

Returns:

  • (Fixnum)

    hash value of object



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 85

def hash
  [self.class, to_h].hash
end
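
Defining #hash alongside #eql? lets equal objects act as the same Hash key and collapse under Array#uniq. A minimal sketch with a hypothetical stand-in class (KeySketch) that copies the method bodies shown on this page:

```ruby
# Hypothetical stand-in with the same eql?/hash pairing as this class.
class KeySketch
  def initialize(host)
    @host = host.to_s
  end

  def to_h
    { host: @host }
  end

  def ==(other)
    to_h == other.to_h
  end

  def eql?(other)
    self.class == other.class && self == other
  end

  def hash
    [self.class, to_h].hash
  end
end

first  = KeySketch.new("oak-batch.osc.edu")
second = KeySketch.new("oak-batch.osc.edu")

cache  = { first => "cached status" }
hit    = cache[second]               # found: equal objects share a hash value
unique = [first, second].uniq.size   # duplicates collapse to one element
```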

#hold_job(id, type: :u) ⇒ void

This method returns an undefined value.

Put the specified job on hold. Possible hold types:

:u => Available to the owner of the job, the batch operator and the batch administrator
:o => Available to the batch operator and the batch administrator
:s => Available to the batch administrator

Examples:

Put job '10219837.oak-batch.osc.edu' on hold

my_conn.hold_job('10219837.oak-batch.osc.edu')

Parameters:

  • id (#to_s)

    the id of the job

  • type (#to_s) (defaults to: :u)

    type of hold to be applied



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 295

def hold_job(id, type: :u)
  connect do |cid|
    FFI.pbs_holdjob cid, id.to_s, type.to_s, nil
  end
end
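
Both #hold_job and #release_job pass type through #to_s without checking it against the three documented hold types. A caller that wants to fail early could validate first. The sketch below is one way to do that; HOLD_TYPES and normalize_hold_type are hypothetical and not part of this class.

```ruby
# Hold types listed above, keyed by the symbol passed as `type:`.
HOLD_TYPES = {
  u: "owner, batch operator, and batch administrator",
  o: "batch operator and batch administrator",
  s: "batch administrator"
}.freeze

# Hypothetical guard: normalize the type the way #hold_job does (via to_s)
# and reject anything outside the documented set before contacting the server.
def normalize_hold_type(type)
  key = type.to_s
  raise ArgumentError, "unknown hold type: #{key.inspect}" unless HOLD_TYPES.key?(key.to_sym)
  key
end

user_hold   = normalize_hold_type(:u)   # a Symbol is accepted
system_hold = normalize_hold_type("s")  # so is a String
```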

#release_job(id, type: :u) ⇒ void

This method returns an undefined value.

Release a specified job that is on hold. Possible hold types:

:u => Available to the owner of the job, the batch operator and the batch administrator
:o => Available to the batch operator and the batch administrator
:s => Available to the batch administrator

Examples:

Release job '10219837.oak-batch.osc.edu' from hold

my_conn.release_job('10219837.oak-batch.osc.edu')

Parameters:

  • id (#to_s)

    the id of the job

  • type (#to_s) (defaults to: :u)

    type of hold to be removed



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 311

def release_job(id, type: :u)
  connect do |cid|
    FFI.pbs_rlsjob cid, id.to_s, type.to_s, nil
  end
end

#select_jobs(attribs: []) ⇒ Hash

Get a hash of job details, keyed by job id, for the selected jobs on the batch server

Examples:

Status info for jobs owned by Bob

my_conn.select_jobs(attribs: [{name: "User_List", value: "bob", op: :eq}])
#=>
#{
#  "10219837.oak-batch.osc.edu" => {
#    :Job_Owner => "bob@oakley02.osc.edu",
#    :Job_Name => "CFD_Solver",
#    ...
#  },
#  "10219839.oak-batch.osc.edu" => {
#    :Job_Owner => "bob@oakley02.osc.edu",
#    :Job_Name => "CFD_Solver2",
#    ...
#  },
#  ...
#}

Parameters:

  • attribs (Array<#to_h>) (defaults to: [])

    list of hashes describing attributes to select on

Returns:

  • (Hash)

    hash of details of selected jobs



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 232

def select_jobs(attribs: [])
  connect do |cid|
    attribs = FFI::Attropl.from_list(attribs.map(&:to_h))
    batch_status = FFI.pbs_selstat cid, attribs, nil
    batch_status.to_h.tap { FFI.pbs_statfree batch_status }
  end
end
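
Since #select_jobs calls #to_h on every element of attribs, the list may mix plain hashes with any object that responds to #to_h. A minimal sketch of that normalization step; Criterion is a hypothetical value object, and the attribute names are taken from the examples on this page.

```ruby
# Hypothetical value object; anything that responds to #to_h will do.
Criterion = Struct.new(:name, :value, :op, keyword_init: true)

attribs = [
  Criterion.new(name: "User_List", value: "bob", op: :eq),
  { name: "Job_Name", value: "CFD_Solver", op: :eq } # plain hashes work too
]

# This mirrors the normalization step inside #select_jobs.
normalized = attribs.map(&:to_h)

normalized.each { |attrib| puts attrib.inspect }
```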

#submit(content, args: [], env: {}, chdir: nil) ⇒ String

Submit a script expanded as a string to the batch server

Parameters:

  • content (#to_s)

    script as a string

  • args (Array<#to_s>) (defaults to: [])

    arguments passed to `qsub` command

  • env (Hash{#to_s => #to_s}) (defaults to: {})

    environment variables to set

  • chdir (#to_s, nil) (defaults to: nil)

    working directory where `qsub` is called from

Returns:

  • (String)

    the id of the job that was created

Raises:

  • (Error)

    if `qsub` command exited unsuccessfully



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 376

def submit(content, args: [], env: {}, chdir: nil)
  call(:qsub, *args, env: env, stdin: content, chdir: chdir).strip
end
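
#submit forwards args to qsub unchanged, feeds the script on standard input, and strips the trailing newline from the output to obtain the job id. The sketch below reproduces that flow with a stand-in for the private call helper, whose real implementation is not shown on this page. SubmitSketch and its canned output are hypothetical.

```ruby
# Hypothetical stand-in for the private `call` helper used by #submit.
# It records what it was given and returns output shaped like a job id
# followed by a newline.
class SubmitSketch
  attr_reader :last_call

  def submit(content, args: [], env: {}, chdir: nil)
    call(:qsub, *args, env: env, stdin: content, chdir: chdir).strip
  end

  private

  def call(cmd, *args, env: {}, stdin: "", chdir: nil)
    @last_call = { cmd: cmd, args: args, env: env, stdin: stdin, chdir: chdir }
    "6621251.oak-batch.osc.edu\n"
  end
end

script = <<~SCRIPT
  #!/bin/bash
  echo "hello from the job"
SCRIPT

sketch = SubmitSketch.new
job_id = sketch.submit(
  script,
  args: ["-N", "myjob", "-l", "walltime=00:10:00"],
  env:  { "TOKEN" => "abc" }
)

puts job_id
```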

#submit_script(script, queue: nil, headers: {}, resources: {}, envvars: {}, qsub: true) ⇒ String

Deprecated.

Use #submit instead.

Submit a script to the batch server

Examples:

Submit a script with a few PBS directives

my_conn.submit_script("/path/to/script",
  headers: {
    Job_Name: "myjob",
    Join_Path: "oe"
  },
  resources: {
    nodes: "4:ppn=12",
    walltime: "12:00:00"
  },
  envvars: {
    TOKEN: "asd90f9sd8g90hk34"
  }
)
#=> "6621251.oak-batch.osc.edu"

Parameters:

  • script (#to_s)

    path to the script

  • queue (#to_s) (defaults to: nil)

    queue to submit script to

  • headers (Hash) (defaults to: {})

    pbs headers

  • resources (Hash) (defaults to: {})

    pbs resources

  • envvars (Hash) (defaults to: {})

    pbs environment variables

  • qsub (Boolean) (defaults to: true)

    whether to submit with the `qsub` binary (true) or the library (false)

Returns:

  • (String)

    the id of the job that was created



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 352

def submit_script(script, queue: nil, headers: {}, resources: {}, envvars: {}, qsub: true)
  send(qsub ? :qsub_submit : :pbs_submit, script.to_s, queue.to_s, headers, resources, envvars)
end

#submit_string(string, **kwargs) ⇒ String

Deprecated.

Use #submit instead.

Submit a script expanded into a string to the batch server

Parameters:

  • string (#to_s)

    script as a string

  • queue (#to_s)

    queue to submit script to

  • headers (Hash)

    pbs headers

  • resources (Hash)

    pbs resources

  • envvars (Hash)

    pbs environment variables

  • qsub (Boolean)

    whether to submit with the `qsub` binary (true) or the library (false)

Returns:

  • (String)

    the id of the job that was created



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 361

def submit_string(string, **kwargs)
  Tempfile.open('qsub.') do |f|
    f.write string.to_s
    f.close
    submit_script(f.path, **kwargs)
  end
end
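
#submit_string writes the script to a temporary file, closes it so the contents are flushed, and hands the path to #submit_script. A minimal sketch of the same Tempfile pattern, with a hypothetical consumer (line_count) standing in for the submission call:

```ruby
require "tempfile"

# Hypothetical consumer that needs a path on disk rather than a string.
def line_count(path)
  File.readlines(path).size
end

script = "#!/bin/bash\necho hello\n"

# Same shape as #submit_string: write, close to flush, then hand over the path.
lines = Tempfile.open("qsub.") do |f|
  f.write script
  f.close
  line_count(f.path)
end

puts lines
```

The block's value becomes the return value of Tempfile.open, which is how the job id travels back to the caller in #submit_string.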

#to_h ⇒ Hash

Convert object to hash

Returns:

  • (Hash)

    the hash describing this object



# File 'lib/ood_core/job/adapters/torque/batch.rb', line 64

def to_h
  {host: host, submit_host: submit_host, strict_host_checking: strict_host_checking, lib: lib, bin: bin}
end