Add a Jupyter App on a Kubernetes Cluster that behaves like HPC compute
This tutorial will walk you through creating an interactive Jupyter app that your users will use to launch a Jupyter Notebook Server in a Kubernetes cluster. The container will behave much like an HPC compute node. The benefit is that a single app can serve both traditional HPC and Kubernetes.
It assumes you already have a working understanding of app development. The purpose of this document is to describe how to write apps specifically for a Kubernetes cluster, so it skips many important details about app development that can be found in other tutorials like Add a Jupyter App.
We’re going to be looking at the bc osc jupyter app, which is OSC’s production Jupyter app. You can fork, clone and modify it for your site. This page also holds the submit yml in full for reference.
Refer to the interactive K8s Jupyter app for additional details on items defined in submit.yml.erb as well as a more traditional container approach.
The container
For Kubernetes pods to look like HPC compute nodes, the container needs all the OS packages present in the HPC environment. The OS of the container itself also needs to be compatible with the packages installed in the HPC environment, so if you run RHEL 8 on HPC you would also need to run RHEL 8 inside the container.
Things like Lmod and HPC applications will need to run inside the Pod’s container, just as if the job had been spawned by a traditional HPC resource manager.
Switch between SLURM and Kubernetes
The first big change from a traditional HPC interactive app is that the main YAML structure is wrapped in a large if statement based on the cluster choice. If a user chooses one of the HPC clusters, the SLURM submit YAML is rendered; otherwise the Kubernetes YAML is rendered. In the following examples the SLURM clusters are named owens and pitzer and the Kubernetes cluster is named kubernetes.
Here is the beginning of the block:
<% if cluster =~ /owens|pitzer/ -%>
---
batch_connect:
  template: "basic"
  conn_params:
    - jupyter_api
script:
  ...SLURM specific...
Here is the logic to select Kubernetes:
<% elsif cluster =~ /kubernetes/
...Ruby variables setup here...
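The branching described above can be tried outside of OnDemand with Ruby's standard ERB library. The following is a minimal sketch, not the app's real template: the template body is heavily trimmed, and the helper name render_submit is hypothetical. It assumes the template is rendered with the "-" trim mode, which is what makes the -%> closing tags swallow their trailing newline.

```ruby
require 'erb'
require 'yaml'

# Heavily trimmed stand-in for submit.yml.erb (illustrative only).
SUBMIT_TEMPLATE = <<~'SUBMIT_ERB'
  <% if cluster =~ /owens|pitzer/ -%>
  ---
  batch_connect:
    template: "basic"
  <% elsif cluster =~ /kubernetes/ -%>
  ---
  script:
    native:
      container:
        name: "jupyter"
  <% end -%>
SUBMIT_ERB

# Hypothetical helper: render the template for one cluster choice.
def render_submit(cluster)
  ERB.new(SUBMIT_TEMPLATE, trim_mode: '-').result_with_hash(cluster: cluster)
end

# Each cluster choice yields a different top-level YAML document.
puts YAML.safe_load(render_submit('owens')).keys.inspect
puts YAML.safe_load(render_submit('kubernetes')).keys.inspect
```

Only one branch is ever rendered, so the SLURM and Kubernetes halves of the file never need to be valid together, only individually.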
container spec
Let’s look first at the container section of the Kubernetes block. Here you must specify the name, image and command. The name determines the Pod ID (the job name in HPC parlance). The image should be the HPC container image, and command will be the job script that has been adapted to work with both SLURM and Kubernetes. The command will be run from the user’s home directory; mount requirements are covered in the mounts section.
Warning
These examples use images from the Ohio Supercomputer Center’s private registry. They will not work at your site as this registry requires authentication.
One important aspect of the command is that the job script it executes is built using the standard before.sh, script.sh and after.sh that one would normally use to build the job script for interactive apps running on HPC resources. With the pod set up this way, the same job script that runs on SLURM is also used to launch the container in Kubernetes.
Next you can specify additional environment variables in env.
  native:
    container:
      name: "jupyter"
      image: "docker-registry.osc.edu/ondemand/ondemand-base-rhel7:0.3.1"
      image_pull_policy: "IfNotPresent"
      command: ["/bin/bash","-l","<%= staged_root %>/job_script_content.sh"]
      restart_policy: 'OnFailure'
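As an illustration of the env values, here is a minimal Ruby sketch that builds the same keys used in the full submit yml (NB_UID, NB_USER, NB_GID, CLUSTER, KUBECONFIG). The helper name notebook_env is hypothetical, and the sketch reads the IDs from Process.uid and Process.gid rather than the Etc.getpwnam lookup used in the full file, so that it runs even where the user has no passwd entry. All values are converted to strings because they are quoted in the YAML.

```ruby
require 'yaml'

# Hypothetical helper: environment passed to the Jupyter container.
# Defaults describe the user running the template; every value is a string.
def notebook_env(cluster:, user: ENV['USER'], uid: Process.uid, gid: Process.gid)
  {
    'NB_UID'     => uid.to_s,
    'NB_USER'    => user.to_s,
    'NB_GID'     => gid.to_s,
    'CLUSTER'    => cluster,
    'KUBECONFIG' => '/dev/null'
  }
end

# Dump as YAML to see the shape of the env section.
puts({ 'env' => notebook_env(cluster: 'owens') }.to_yaml)
```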
mounts
For a pod to look like an HPC environment the home directory of the user and any shared filesystems would need to be mounted on Kubernetes worker nodes and then made available to the pod.
In the example a Ruby structure is created to streamline some of the direct mounts where the path outside the container is the same as the path inside the container:
mounts = {
  'home'    => OodSupport::User.new.home,
  'support' => OodSupport::User.new('support').home,
  'project' => '/fs/project',
  'scratch' => '/fs/scratch',
  'ess'     => '/fs/ess',
}
These mounts are defined in the YAML using a loop:
    mounts:
    <%- mounts.each_pair do |name, mount| -%>
      - type: host
        name: <%= name %>
        host_type: Directory
        path: <%= mount %>
        destination_path: <%= mount %>
    <%- end -%>
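To see what the loop expands to, the sketch below builds the same list of entries in plain Ruby. It is only an illustration: the helper name direct_mounts is hypothetical, and the two home directory paths are made-up stand-ins for the OodSupport::User lookups, which need the OnDemand environment to run.

```ruby
require 'yaml'

# Hypothetical helper: one host mount per entry, with the same path
# inside and outside the container.
def direct_mounts(mounts)
  mounts.map do |name, path|
    {
      'type'             => 'host',
      'name'             => name,
      'host_type'        => 'Directory',
      'path'             => path,
      'destination_path' => path
    }
  end
end

# Assumed example paths; real values come from OodSupport::User.
example = {
  'home'    => '/users/example/alice',
  'support' => '/users/example/support',
  'project' => '/fs/project',
  'scratch' => '/fs/scratch',
  'ess'     => '/fs/ess'
}

puts({ 'mounts' => direct_mounts(example) }.to_yaml)
```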
Additional mounts are needed to make the pod behave like an HPC compute node. The following are mounted into the container:
MUNGE socket so SLURM commands inside the pod can work
SLURM configuration so SLURM commands inside the pod know about scheduler host
SSSD pipes and configuration as well as nsswitch.conf so ID lookups inside the pod will work
Lmod initialization script
Lmod HPC applications
      - type: host
        name: munge-socket
        host_type: Socket
        path: /var/run/munge/munge.socket.2
        destination_path: /var/run/munge/munge.socket.2
      - type: host
        name: slurm-conf
        host_type: Directory
        path: /etc/slurm
        destination_path: /etc/slurm
      - type: host
        name: sssd-pipes
        host_type: Directory
        path: /var/lib/sss/pipes
        destination_path: /var/lib/sss/pipes
      - type: host
        name: sssd-conf
        host_type: Directory
        path: /etc/sssd
        destination_path: /etc/sssd
      - type: host
        name: nsswitch
        host_type: File
        path: /etc/nsswitch.conf
        destination_path: /etc/nsswitch.conf
      - type: host
        name: lmod-init
        host_type: File
        path: /apps/<%= compute_cluster %>/lmod/lmod.sh
        destination_path: /etc/profile.d/lmod.sh
      - type: host
        name: intel
        host_type: Directory
        path: /nfsroot/<%= compute_cluster %>/opt/intel
        destination_path: /opt/intel
      - type: host
        name: apps
        host_type: Directory
        path: /apps/<%= compute_cluster %>
        destination_path: <%= apps_path %>
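Each of these host mounts declares a host_type of Directory, File or Socket, and the path has to exist with that type on the Kubernetes worker node. The following is a small sketch of a preflight check one could run on a worker node; it is not part of OnDemand, and the names HOST_TYPE_CHECKS and missing_mounts are hypothetical.

```ruby
# Map each host_type used in this tutorial to a filesystem test.
HOST_TYPE_CHECKS = {
  'Directory' => ->(path) { File.directory?(path) },
  'File'      => ->(path) { File.file?(path) },
  'Socket'    => ->(path) { File.socket?(path) }
}.freeze

# Hypothetical helper: return the mounts whose host path is absent
# or is not of the declared type.
def missing_mounts(mounts)
  mounts.reject do |mount|
    check = HOST_TYPE_CHECKS.fetch(mount.fetch('host_type'))
    check.call(mount.fetch('path'))
  end
end

# A few of the mounts from this section, as plain hashes.
wanted = [
  { 'name' => 'munge-socket', 'host_type' => 'Socket',    'path' => '/var/run/munge/munge.socket.2' },
  { 'name' => 'slurm-conf',   'host_type' => 'Directory', 'path' => '/etc/slurm' },
  { 'name' => 'nsswitch',     'host_type' => 'File',      'path' => '/etc/nsswitch.conf' }
]

missing_mounts(wanted).each do |mount|
  puts "#{mount['name']}: #{mount['path']} is not a #{mount['host_type']} on this host"
end
```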
submit yml in full
# submit.yml.erb
<%-
  cores = num_cores.to_i
  if cores == 0 && cluster == "pitzer"
    # little optimization for pitzer nodes. They want the whole node, if they chose 'any',
    # it can be scheduled on p18 or p20 nodes. If not, they'll get the constraint below.
    base_slurm_args = ["--nodes", "1", "--exclusive"]
  elsif cores == 0
    # full node on owens
    cores = 28
    base_slurm_args = ["--nodes", "1", "--ntasks-per-node", "28"]
  else
    base_slurm_args = ["--nodes", "1", "--ntasks-per-node", "#{cores}"]
  end
  slurm_args = case node_type
               when "gpu-40core"
                 base_slurm_args + ["--constraint", "40core"]
               when "gpu-48core"
                 base_slurm_args + ["--constraint", "48core"]
               when "any-40core"
                 base_slurm_args + ["--constraint", "40core"]
               when "any-48core"
                 base_slurm_args + ["--constraint", "48core"]
               when "hugemem"
                 base_slurm_args + ["--partition", "hugemem", "--exclusive"]
               when "largemem"
                 base_slurm_args + ["--partition", "largemem", "--exclusive"]
               when "debug"
                 base_slurm_args + ["--partition", "debug", "--exclusive"]
               else
                 base_slurm_args
               end
-%>
<% if cluster =~ /owens|pitzer/ -%>
---
batch_connect:
  template: "basic"
  conn_params:
    - jupyter_api
script:
  accounting_id: "<%= account %>"
<% if node_type =~ /gpu/ -%>
  gpus_per_node: 1
<% end -%>
  native:
    <%- slurm_args.each do |arg| %>
    - "<%= arg %>"
    <%- end %>
<% elsif cluster =~ /kubernetes/
   if node_type =~ /owens/
     compute_cluster = "owens"
     apps_path = "/usr/local"
     # Memory per core with hyperthreading enabled
     memory_mb = num_cores.to_i * 2200
   elsif node_type =~ /pitzer/
     compute_cluster = "pitzer"
     apps_path = "/apps"
     # Memory per core with hyperthreading enabled
     memory_mb = num_cores.to_i * 4000
   end
   mounts = {
     'home'    => OodSupport::User.new.home,
     'support' => OodSupport::User.new('support').home,
     'project' => '/fs/project',
     'scratch' => '/fs/scratch',
     'ess'     => '/fs/ess',
   }
-%>
---
script:
  accounting_id: "<%= account %>"
  wall_time: "<%= bc_num_hours.to_i * 3600 %>"
  <%- if node_type =~ /gpu/ -%>
  gpus_per_node: 1
  <%- end -%>
  native:
    container:
      name: "jupyter"
      image: "docker-registry.osc.edu/ondemand/ondemand-base-rhel7:0.3.1"
      image_pull_policy: "IfNotPresent"
      command: ["/bin/bash","-l","<%= staged_root %>/job_script_content.sh"]
      restart_policy: 'OnFailure'
      env:
        NB_UID: "<%= Etc.getpwnam(ENV['USER']).uid %>"
        NB_USER: "<%= ENV['USER'] %>"
        NB_GID: "<%= Etc.getpwnam(ENV['USER']).gid %>"
        CLUSTER: "<%= compute_cluster %>"
        KUBECONFIG: "/dev/null"
      labels:
        osc.edu/cluster: "<%= compute_cluster %>"
      port: "8080"
      cpu: "<%= num_cores %>"
      memory: "<%= memory_mb %>Mi"
    mounts:
    <%- mounts.each_pair do |name, mount| -%>
      - type: host
        name: <%= name %>
        host_type: Directory
        path: <%= mount %>
        destination_path: <%= mount %>
    <%- end -%>
      - type: host
        name: munge-socket
        host_type: Socket
        path: /var/run/munge/munge.socket.2
        destination_path: /var/run/munge/munge.socket.2
      - type: host
        name: slurm-conf
        host_type: Directory
        path: /etc/slurm
        destination_path: /etc/slurm
      - type: host
        name: sssd-pipes
        host_type: Directory
        path: /var/lib/sss/pipes
        destination_path: /var/lib/sss/pipes
      - type: host
        name: sssd-conf
        host_type: Directory
        path: /etc/sssd
        destination_path: /etc/sssd
      - type: host
        name: nsswitch
        host_type: File
        path: /etc/nsswitch.conf
        destination_path: /etc/nsswitch.conf
      - type: host
        name: lmod-init
        host_type: File
        path: /apps/<%= compute_cluster %>/lmod/lmod.sh
        destination_path: /etc/profile.d/lmod.sh
      - type: host
        name: apps
        host_type: Directory
        path: /apps/<%= compute_cluster %>
        destination_path: <%= apps_path %>
    node_selector:
      osc.edu/role: ondemand
<% end -%>
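The SLURM argument selection at the top of the file is plain Ruby, so it can be checked outside of ERB. The sketch below restates that logic as a method; the name slurm_args_for and its keyword arguments are hypothetical, while the node counts, constraints and partitions are copied from the file above.

```ruby
# Hypothetical restatement of the slurm_args logic from submit.yml.erb.
def slurm_args_for(cluster:, node_type:, num_cores:)
  cores = num_cores.to_i
  base =
    if cores.zero? && cluster == 'pitzer'
      # whole node on pitzer; a constraint may still be added below
      %w[--nodes 1 --exclusive]
    elsif cores.zero?
      # full node on owens
      %w[--nodes 1 --ntasks-per-node 28]
    else
      ['--nodes', '1', '--ntasks-per-node', cores.to_s]
    end

  case node_type
  when 'gpu-40core', 'any-40core'
    base + %w[--constraint 40core]
  when 'gpu-48core', 'any-48core'
    base + %w[--constraint 48core]
  when 'hugemem', 'largemem', 'debug'
    base + ['--partition', node_type, '--exclusive']
  else
    base
  end
end

# Form values arrive as strings, hence num_cores: '4'.
puts slurm_args_for(cluster: 'pitzer', node_type: 'gpu-48core', num_cores: '4').join(' ')
```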