Quick install
Download and extract the latest Sbox release.
To access JupyterLab sessions, install Anaconda and create the required virtual environments and modulefiles. Review “Requirements” to learn more.
Update the config file based on the cluster information. Review “Configuration” to learn more.
Place a modulefile for Sbox under the $MODULEPATH/sbox directory and load the module, or add the Sbox bin directory to $PATH. An Sbox template modulefile can be found here.
Requirements
Sbox requires Slurm and Python >= 3.6.8. The interactive jupyter command requires Anaconda and an environment module system (e.g. Lmod) in addition to Slurm and Python. To use R and Julia in JupyterLab sessions, R and IRkernel as well as Julia need to be installed.
Note that some Sbox options depend on other commands. Review their requirements under the command line options.
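The base requirements can be checked from Python before installing. The following is a minimal sketch and not part of Sbox: the function name missing_requirements is hypothetical, and looking for srun and sbatch is an assumption about which Slurm commands should be on the PATH.

```python
import shutil
import sys

# Minimum Python version stated in the requirements.
MIN_PYTHON = (3, 6, 8)


def missing_requirements(commands=("srun", "sbatch"), version=None, which=shutil.which):
    """Return a list of problems; an empty list means every check passed.

    `commands` is an assumed list of Slurm commands to look for on the PATH.
    `version` and `which` can be overridden, which is mainly useful for testing.
    """
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    problems = []
    if version < MIN_PYTHON:
        found = ".".join(str(part) for part in version)
        problems.append("Python >= 3.6.8 is required, found " + found)
    for command in commands:
        if which(command) is None:
            problems.append("command not found on PATH: " + command)
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

The module command is not checked here because it is usually a shell function rather than an executable on the PATH.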
The following shows how to install Anaconda and create the required virtual envs and modulefiles.
Python kernel (Anaconda)
The interactive jupyter command provides a JupyterLab interface for running Python and many scientific packages by using Anaconda. To install Anaconda, find the latest version in the Anaconda archive (https://repo.anaconda.com/archive/) and run:
wget https://repo.anaconda.com/archive/Anaconda3-<year.month>-Linux-x86_64.sh
bash Anaconda3-<year.month>-Linux-x86_64.sh -b -p /<cluster software path>/anaconda/<year.month>
In the above lines, update <year.month> (e.g. 2021.05) based on the Anaconda version and <cluster software path> (e.g. /cluster/software/) based on the cluster path.
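The placeholder substitution can be sketched in Python. The helper anaconda_install_commands below is hypothetical; it only builds the two command lines shown above as strings and does not download or install anything.

```python
def anaconda_install_commands(version, software_path):
    """Fill <year.month> and <cluster software path> into the install commands.

    `version` is the Anaconda release (e.g. "2021.05") and `software_path`
    is the cluster software directory (e.g. "/cluster/software/").
    """
    installer = "Anaconda3-{}-Linux-x86_64.sh".format(version)
    # Strip a trailing slash so the joined prefix has no doubled separators.
    prefix = "{}/anaconda/{}".format(software_path.rstrip("/"), version)
    return [
        "wget https://repo.anaconda.com/archive/" + installer,
        "bash {} -b -p {}".format(installer, prefix),
    ]


for line in anaconda_install_commands("2021.05", "/cluster/software/"):
    print(line)
```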
To load Anaconda with the module load command, create the following modulefile under $MODULEPATH/anaconda/<year.month>.lua:
-- -*- lua -*-
whatis([[Name : anaconda]])
whatis([[Version : <year.month>]])
whatis([[Target : x86_64]])
whatis([[Short description : Python3 distribution including conda and 250+ scientific packages.]])
help([[Python3 distribution including conda and 250+ scientific packages.]])
-- Create environment variables
local this_root = "/<cluster software path>/anaconda/<year.month>"
prepend_path("PATH", this_root .. "/bin", ":")
prepend_path("LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("LD_LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("MANPATH", this_root .. "/share/man", ":")
prepend_path("INCLUDE", this_root .. "/include", ":")
prepend_path("C_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("CPLUS_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("PKG_CONFIG_PATH", this_root .. "/lib/pkgconfig", ":")
setenv("ANACONDA_ROOT", this_root)
Alternatively, add the following Tcl modulefile under $MODULEPATH/anaconda/<year.month>:
#%Module1.0
## Metadata ###########################################
set this_module anaconda
set this_version <year.month>
set this_root /<cluster software path>/${this_module}/${this_version}
set this_docs https://docs.anaconda.com/
set this_module_upper [string toupper ${this_module}]
## Module #############################################
proc ModulesHelp { } {
    global this_module this_version this_root this_docs
    puts stderr "****************************************************"
    puts stderr "Name: ${this_module}"
    puts stderr "Version: ${this_version}"
    puts stderr "Documentation: ${this_docs}"
    puts stderr "****************************************************\n"
}
module-whatis "Set up environment for ${this_module} ${this_version}"
prepend-path PATH ${this_root}/bin
prepend-path LIBRARY_PATH ${this_root}/lib
prepend-path LD_LIBRARY_PATH ${this_root}/lib
prepend-path MANPATH ${this_root}/share/man
prepend-path INCLUDE ${this_root}/include
prepend-path C_INCLUDE_PATH ${this_root}/include
prepend-path CPLUS_INCLUDE_PATH ${this_root}/include
prepend-path PKG_CONFIG_PATH ${this_root}/lib/pkgconfig
setenv ${this_module_upper}_ROOT ${this_root}
R kernel
Users can run R scripts within a JupyterLab notebook with interactive jupyter -k r. To provide R, IRkernel and many other R packages, we can create the following env, which includes r-essentials from Anaconda:
cd /<cluster software path>/anaconda/<year.month>
./bin/conda create -n r-essentials-<R version> -c conda-forge r-essentials r-base r-irkernel jupyterlab
In the above lines, <cluster software path> and <year.month> should be updated based on the Anaconda path and version, and <R version> (e.g. 4.0.3) based on the version of R in the env.
To load the R env, add the following modulefile to $MODULEPATH/r-essentials/<R version>.lua:
-- -*- lua -*-
whatis([[Name : r-essentials]])
whatis([[Version : <R version>]])
whatis([[Target : x86_64]])
whatis([[Short description : A conda environment for R and 80+ scientific packages.]])
help([[A conda environment for R and 80+ scientific packages.]])
-- Create environment variables
local this_root = "/<cluster software path>/anaconda/<year.month>/envs/r-essentials-<R version>"
prepend_path("PATH", this_root .. "/bin", ":")
prepend_path("LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("LD_LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("MANPATH", this_root .. "/share/man", ":")
prepend_path("INCLUDE", this_root .. "/include", ":")
prepend_path("C_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("CPLUS_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("PKG_CONFIG_PATH", this_root .. "/lib/pkgconfig", ":")
setenv("ANACONDA_ROOT", this_root)
Alternatively, add a Tcl modulefile similar to the Tcl template for Anaconda above.
Julia kernel
The interactive jupyter -k julia command provides Julia in a JupyterLab notebook. Julia can be installed from Spack, source or Anaconda. The following shows how to install Julia from Anaconda (note that if Julia has already been installed on the cluster, you can skip this section and use the available Julia module instead).
cd /<cluster software path>/anaconda/<year.month>
./bin/conda create -n julia-<version> -c conda-forge julia
In the above lines, <cluster software path> and <year.month> should be updated based on the Anaconda path and version, and <version> (e.g. 1.6.1) based on the version of Julia in the env.
The following modulefile should be added to $MODULEPATH/julia/<version>.lua:
-- -*- lua -*-
whatis([[Name : julia]])
whatis([[Version : <version>]])
whatis([[Target : x86_64]])
whatis([[Short description : The Julia Language: A fresh approach to technical computing]])
help([[The Julia Language: A fresh approach to technical computing]])
-- Create environment variables
local this_root = "/<cluster software path>/anaconda/<year.month>/envs/julia-<version>"
prepend_path("PATH", this_root .. "/bin", ":")
prepend_path("LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("LD_LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("MANPATH", this_root .. "/share/man", ":")
prepend_path("INCLUDE", this_root .. "/include", ":")
prepend_path("C_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("CPLUS_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("PKG_CONFIG_PATH", this_root .. "/lib/pkgconfig", ":")
setenv("ANACONDA_ROOT", this_root)
Alternatively, add a Tcl modulefile similar to the Tcl template for Anaconda above.
Note that the first time a user runs interactive jupyter -k julia, the Julia Jupyter kernel (IJulia) will be installed under ~/.julia.
On demand Python and R packages
Popular Python packages that are not available in Anaconda can be made available through interactive jupyter -e. For instance, the following shows how to create a TensorFlow (TF) env:
cd /<cluster software path>/anaconda/<year.month>
./bin/conda create -n tensorflow-gpu-<version> anaconda
./bin/conda install -n tensorflow-gpu-<version> tensorflow-gpu gpustat
Similarly, we can create a PyTorch (PT) env with:
cd /<cluster software path>/anaconda/<year.month>
./bin/conda create -n pytorch-<version> anaconda
./bin/conda install -n pytorch-<version> -c pytorch pytorch gpustat
Likewise, we can collect popular R bio packages in the following env from the bioconda channel:
cd /<cluster software path>/anaconda/<year.month>
./bin/conda create -n r-bioessentials-<version> -c bioconda -c conda-forge bioconductor-edger bioconductor-oligo r-monocle3 r-signac r-seurat scanpy macs2 jupyterlab r-irkernel
In the above lines, <cluster software path> and <year.month> should be updated based on the Anaconda path and version, and <version> (e.g. 2.4.1) based on the version of TF, PT, or R.
For each env, we need to add a modulefile to $MODULEPATH/<env name>/<version>.lua. For instance, $MODULEPATH/tensorflow/<version>.lua is:
-- -*- lua -*-
whatis([[Name : tensorflow]])
whatis([[Version : <version>]])
whatis([[Target : x86_64]])
whatis([[Short description : Python3 distribution including TensorFlow and 250+ scientific packages.]])
help([[Python3 distribution including TensorFlow and 250+ scientific packages.]])
-- Create environment variables
local this_root = "/<cluster software path>/anaconda/<year.month>/envs/tensorflow-gpu-<version>"
prepend_path("PATH", this_root .. "/bin", ":")
prepend_path("LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("LD_LIBRARY_PATH", this_root .. "/lib", ":")
prepend_path("MANPATH", this_root .. "/share/man", ":")
prepend_path("INCLUDE", this_root .. "/include", ":")
prepend_path("C_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("CPLUS_INCLUDE_PATH", this_root .. "/include", ":")
prepend_path("PKG_CONFIG_PATH", this_root .. "/lib/pkgconfig", ":")
setenv("ANACONDA_ROOT", this_root)
Alternatively, add a Tcl modulefile similar to the Tcl template for Anaconda above.
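Since the Lua modulefiles in this section differ only in name, version, description and root path, they can be generated rather than copied by hand. The following Python sketch is one way to do that; render_lua_modulefile is a hypothetical helper and not part of Sbox, and the values in the usage line are illustrative only.

```python
# Environment variables that every modulefile in this section prepends to,
# paired with the sub-directory of the env root that is added.
PATH_VARS = [
    ("PATH", "/bin"),
    ("LIBRARY_PATH", "/lib"),
    ("LD_LIBRARY_PATH", "/lib"),
    ("MANPATH", "/share/man"),
    ("INCLUDE", "/include"),
    ("C_INCLUDE_PATH", "/include"),
    ("CPLUS_INCLUDE_PATH", "/include"),
    ("PKG_CONFIG_PATH", "/lib/pkgconfig"),
]


def render_lua_modulefile(name, version, description, root):
    """Return the text of a Lua modulefile that follows the templates above."""
    lines = [
        "-- -*- lua -*-",
        "whatis([[Name : {}]])".format(name),
        "whatis([[Version : {}]])".format(version),
        "whatis([[Target : x86_64]])",
        "whatis([[Short description : {}]])".format(description),
        "help([[{}]])".format(description),
        "-- Create environment variables",
        'local this_root = "{}"'.format(root),
    ]
    for variable, subdir in PATH_VARS:
        lines.append('prepend_path("{}", this_root .. "{}", ":")'.format(variable, subdir))
    lines.append('setenv("ANACONDA_ROOT", this_root)')
    return "\n".join(lines) + "\n"


# Illustrative values; the root path depends on where the env was created.
print(render_lua_modulefile("pytorch", "1.8.1", "A conda environment for PyTorch.",
                            "/path/to/envs/pytorch-1.8.1"))
```

The returned text can then be written to $MODULEPATH/<env name>/<version>.lua.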
Note: Users can add other packages and mix a local stack of packages with the premade environments. For Python and R packages, users can run pip install and install.packages respectively to install packages in their home directory. To install packages in a different path than home, specify the desired path and add it to the library path of the software. See the examples under the interactive command line options.
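For Python, installing outside of home can be sketched as follows. This is an illustration under assumptions rather than the Sbox-documented procedure: the helper names and the example path are hypothetical, while --target is a standard pip install option.

```python
import sys


def pip_target_command(target, packages):
    """Build a pip command that installs `packages` into `target` instead of home."""
    return [sys.executable, "-m", "pip", "install", "--target", target] + list(packages)


def add_library_path(target, search_path=None):
    """Put `target` at the front of the module search path (sys.path by default)."""
    search_path = sys.path if search_path is None else search_path
    if target not in search_path:
        search_path.insert(0, target)
    return search_path


# Hypothetical target directory outside of home.
target_dir = "/data/username/python-packages"
print(" ".join(pip_target_command(target_dir, ["gpustat"])))
add_library_path(target_dir)  # packages installed there become importable
```

Setting the PYTHONPATH environment variable to the same directory has a similar effect without changing any code.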
Configuration
The sbox and interactive commands read the required information from the JSON config file below.
{
    "disk_quota_paths": [],
    "cpu_partition": [],
    "gpu_partition": [],
    "interactive_partition_timelimit": {},
    "jupyter_partition_timelimit": {},
    "partition_qos": {},
    "kernel_module": {},
    "env_module": {}
}
The config file includes:
disk_quota_paths: A list of disk paths used to look up users’ quotas. By default, the first entry is treated as the users’ home path.
cpu_partition: A list of computational partitions.
gpu_partition: A list of GPU partitions.
interactive_partition_timelimit: A dictionary of interactive partitions (i.e. partitions that users access by srun) and their time limits in hours. The first entry is treated as the default partition.
jupyter_partition_timelimit: A dictionary of computational/GPU partitions on which users can run Jupyter servers interactively, and their time limits in hours. The first entry is treated as the default partition.
partition_qos: A dictionary of partitions and the corresponding quality of service (QOS).
kernel_module: A dictionary of kernels and the corresponding modules. A Python kernel is required (review “Requirements”).
env_module: A dictionary of virtual environments and the corresponding modules.
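A config file can be checked against the fields listed above before it is deployed. The following Python sketch is not part of Sbox; config_problems is a hypothetical helper, and using "python" as the key of the required Python kernel is an assumption.

```python
import json

# Expected JSON type of each top-level key, following the list above.
EXPECTED_TYPES = {
    "disk_quota_paths": list,
    "cpu_partition": list,
    "gpu_partition": list,
    "interactive_partition_timelimit": dict,
    "jupyter_partition_timelimit": dict,
    "partition_qos": dict,
    "kernel_module": dict,
    "env_module": dict,
}


def config_problems(config):
    """Return a list of problems found in a parsed config; empty means it looks fine."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append("missing key: " + key)
        elif not isinstance(config[key], expected):
            problems.append("{} should be a JSON {}".format(
                key, "array" if expected is list else "object"))
    kernels = config.get("kernel_module")
    # Assumption: the required Python kernel is registered under the key "python".
    if isinstance(kernels, dict) and "python" not in kernels:
        problems.append("kernel_module needs a python entry")
    return problems


def load_and_check(path):
    """Parse the JSON file at `path` and return its list of problems."""
    with open(path) as handle:
        return config_problems(json.load(handle))
```

Calling load_and_check with the path of the config file returns an empty list when every key is present with the expected type.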
For example:
{
    "disk_quota_paths": ["/home", "/data", "/gprs", "/storage/htc"],
    "cpu_partition": ["Interactive","Lewis","Serial","Dtn","hpc3","hpc4","hpc4rc","hpc5","hpc6","General","Gpu"],
    "gpu_partition": ["Gpu","gpu3","gpu4"],
    "interactive_partition_timelimit": {
        "Interactive": 4,
        "Dtn": 4,
        "Gpu": 2
    },
    "jupyter_partition_timelimit": {
        "Lewis": 8,
        "hpc4": 8,
        "hpc5": 8,
        "hpc6": 8,
        "gpu3": 8,
        "gpu4": 8,
        "Gpu": 2
    },
    "partition_qos": {
        "Interactive": "interactive",
        "Serial": "seriallong",
        "Dtn": "dtn"
    },
    "kernel_module": {
        "python": "anaconda",
        "r": "r-essentials",
        "julia": "julia"
    },
    "env_module": {
        "tensorflow-v1.9": "tensorflow/1.9.0",
        "tensorflow": "tensorflow",
        "pytorch": "pytorch",
        "r-bio": "r-bioessentials"
    }
}
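To illustrate how such a config could be consumed, the Python sketch below picks the default interactive partition (the first entry), converts its time limit from hours to a Slurm time string, and adds the QOS when one is defined. It is an assumption about usage, not the actual Sbox implementation; the helper names are hypothetical, and only standard srun options are used.

```python
import json


def default_partition(timelimits):
    """Return the first partition; json.loads keeps the file's key order."""
    return next(iter(timelimits))


def slurm_time(hours):
    """Convert a time limit in hours to Slurm's HH:MM:SS format."""
    total_minutes = int(round(hours * 60))
    whole_hours, minutes = divmod(total_minutes, 60)
    return "{:02d}:{:02d}:00".format(whole_hours, minutes)


def srun_args(config, partition=None):
    """Build an srun argument list for an interactive shell on `partition`."""
    limits = config["interactive_partition_timelimit"]
    partition = partition or default_partition(limits)
    args = ["srun", "--partition", partition, "--time", slurm_time(limits[partition])]
    qos = config.get("partition_qos", {}).get(partition)
    if qos:
        args += ["--qos", qos]
    return args + ["--pty", "/bin/bash"]


# A subset of the example config above.
EXAMPLE = json.loads("""
{
    "interactive_partition_timelimit": {"Interactive": 4, "Dtn": 4, "Gpu": 2},
    "partition_qos": {"Interactive": "interactive", "Serial": "seriallong", "Dtn": "dtn"}
}
""")

print(" ".join(srun_args(EXAMPLE)))
# prints: srun --partition Interactive --time 04:00:00 --qos interactive --pty /bin/bash
```

Key order is guaranteed for dictionaries from Python 3.7 onwards; on Python 3.6 it is preserved by CPython as an implementation detail.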