- Environment Modules
- How to
- List the available software
- Load a software
- Unload a software
- Software stacking
- Conda
- Shared Conda environments
- Why ask for a shared environment
- "Private" Conda environments
- Cons of Conda
- Singularity
To provide a software environment, we rely on two main technologies, Conda and Singularity, depending on each software's specificities, its license, and so on.
To offer a unified user interface, we implement Environment Modules on top of them.
Environment Modules
The Environment Modules package is a tool that simplifies shell initialization and lets users easily modify their environment during a session using modulefiles. Each modulefile contains the information needed to configure the shell for an application. Modules can be loaded and unloaded dynamically and atomically, in a clean fashion.
Modules are useful in managing different versions of applications. Modules can also be bundled into metamodules that will load an entire suite of different applications.
In our case, within the IFB [Core] Cluster[s], the Conda environment or the Singularity image is loaded through a modulefile: one Conda environment, one modulefile.
But why use module load fastqc/0.11.7 instead of conda activate fastqc-0.11.7?
- Module provides autocompletion that helps you search for a tool and its versions: module load snp<TAB><TAB>, module load fastqc<TAB><TAB>.
- Module can load either Conda environments or Singularity wrappers. That way, there is one loader for different underlying technologies.
How to
List the available software
module avail
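On a cluster with many tools, the full listing can be long. As a minimal sketch, assuming a bash session, the listing can be filtered by name. The helper name find_module is hypothetical and is not part of Environment Modules. Many versions of module avail print their listing on stderr, which is why the sketch merges stderr into stdout before filtering.

```shell
# find_module PATTERN
# Hypothetical helper (not part of Environment Modules): print the available
# modules whose name contains PATTERN, ignoring case.
find_module() {
    # `module avail` often writes to stderr, so merge it into stdout
    # before handing the listing to grep.
    module avail 2>&1 | grep -i -- "$1"
}

# Usage (on the cluster):
#   find_module fastqc
```

module avail also accepts a name directly (for example module avail fastqc), which is usually enough when you already know how the tool name starts.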
Load a software
module load fastqc/0.11.7
Unload a software
module unload fastqc/0.11.7
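To start from a known state, the standard subcommands module purge (unload everything) and module list (show what is loaded) can be combined with module load. The following is only a sketch, assuming bash; the helper name load_clean is hypothetical.

```shell
# load_clean MODULE...
# Hypothetical helper: start from an empty module environment, load the
# requested modules, then print what is loaded.
load_clean() {
    module purge || return 1      # unload every currently loaded module
    module load "$@" || return 1  # load the requested modules
    module list                   # show the modules that are now loaded
}

# Usage (on the cluster):
#   load_clean fastqc/0.11.7
```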
Software stacking
Modules, and the Conda environments behind them, can be stacked if you need several tools at once.
module load trinity/2.8.4 fastqc/0.11.7
module load snakemake/5.3.0
But in case of incompatibilities, for example two tools that require python2 and python3 respectively, it is recommended to load each tool just before using it and to unload it afterwards.
$ module load trinity/2.8.4
$ Trinity --seqType fq --max_memory 50G --left reads_1.fq.gz --right reads_2.fq.gz --CPU 6
$ module unload trinity/2.8.4
$ module load fastqc/0.11.7
$ fastqc Trinity.fas
$ module unload fastqc/0.11.7
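The load, run, unload pattern above can be wrapped in a small function so that the module is always unloaded, even when the command fails. This is a sketch under the assumption of a bash session; run_with_module is a hypothetical name, not a command provided by the cluster.

```shell
# run_with_module MODULE COMMAND [ARGS...]
# Hypothetical helper: load MODULE, run COMMAND with its arguments, unload
# MODULE again, and return the exit status of COMMAND.
run_with_module() {
    local mod="$1"
    shift
    module load "$mod" || return 1
    local status=0
    "$@" || status=$?        # remember the command's exit status
    module unload "$mod"     # always unload, even after a failure
    return "$status"
}

# Usage (on the cluster):
#   run_with_module fastqc/0.11.7 fastqc reads_1.fq.gz
```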
Conda
Most tools have requirements, and some have many: Python or R libraries in specific versions, the very latest compiler, or simply a newer one than the system provides. Often, those dependencies are not compatible with what is already installed on our system, or with each other.
Conda is an open source package, dependency and environment manager for any language: Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN. Miniconda is a small “bootstrap” version that includes only conda, Python, and the packages they depend on. Over 720 scientific packages and their dependencies can be installed individually from the Continuum repository with the conda install command.
At IGBMC, it allows us to install tools within dedicated, isolated environments.
Note that not all software is available through Conda. IGBMC contributes new packages to Conda through the Bioconda GitHub repository.
Shared Conda environments
We install software and software environments with Miniconda3.
Why ask for a shared environment
There are different use cases:
- I don't know how to use Conda
- I'm preparing a training session and I want all the attendees to have the same software environment
- Conda packages can be heavy in terms of disk usage
There are two ways to request a tool or a Conda environment:
- Propose one via our dedicated git repository cluster/tools (shared with the IFB)
- Request a tool: Outils de calcul ("Computing tools")
To find out whether a package, and a specific version of it, is available in the bioconda, conda-forge and defaults channels:
conda search -c conda-forge -c bioconda my_tool
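The same search can be narrowed to a version and used as a yes/no check in a script. The sketch below assumes bash and assumes that conda search returns a non-zero exit status when nothing matches; package_available is a hypothetical helper name.

```shell
# package_available NAME [VERSION]
# Hypothetical helper: succeed if the package (optionally at a matching
# version) is found in conda-forge, bioconda or the default channels.
package_available() {
    local spec="$1"
    if [ -n "${2:-}" ]; then
        spec="$1=$2"      # e.g. fastqc=0.11.7
    fi
    # Discard the listing; only the exit status matters here.
    conda search -c conda-forge -c bioconda "$spec" >/dev/null 2>&1
}

# Usage:
#   if package_available fastqc 0.11.7; then echo "available"; fi
```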
"Private" Conda environments
We don't recommend installing tools on your own when the required tool is available as a Conda package in the Bioconda or conda-forge channels, because your ~ directory isn't designed to store a lot of files. If you really want to install Conda packages yourself, please install them in your project directory.
To do that, edit your Conda configuration file, ~/.condarc:
envs_dirs:
- /shared/projects/<project_name>/conda/env
pkgs_dirs:
- /shared/projects/<project_name>/conda/pkgs
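As an alternative to editing the file by hand, the same two settings can be written with conda config, which updates ~/.condarc by default. This is a sketch assuming bash; use_project_conda_dirs is a hypothetical helper name, and the project name passed to it is whatever replaces <project_name> above.

```shell
# use_project_conda_dirs PROJECT
# Hypothetical helper: register the project directories for environments and
# package caches through `conda config` instead of editing ~/.condarc by hand.
use_project_conda_dirs() {
    local base="/shared/projects/$1/conda"
    conda config --add envs_dirs "$base/env" || return 1
    conda config --add pkgs_dirs "$base/pkgs" || return 1
    # Print the resulting settings for a quick check.
    conda config --show envs_dirs pkgs_dirs
}

# Usage, with a placeholder project name and a hypothetical environment name:
#   use_project_conda_dirs <project_name>
#   conda create -n my_env -c conda-forge -c bioconda fastqc=0.11.7
```

Environments created by name afterwards should then land in the project directory rather than in ~.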
Cons of Conda
These environments isolate the software: Python, R or Perl libraries installed on the system or in other Conda environments are not available within a given Conda environment. If this is an issue for you, let us know.
Singularity
Singularity is a free, cross-platform and open-source computer program that performs operating-system-level virtualization also known as containerization.
One of the main uses of Singularity is to bring containers and reproducibility to scientific computing and the high-performance computing (HPC) world.
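To illustrate what a wrapper around a Singularity image can look like, here is a minimal sketch assuming bash. It is not necessarily how the wrappers on the cluster are written; the image path and the function name fastqc_container are hypothetical. It relies only on singularity exec, which runs a command inside an image.

```shell
# Path to a container image. Hypothetical: point it at a real image file.
FASTQC_IMAGE="${FASTQC_IMAGE:-/path/to/fastqc-0.11.7.sif}"

# fastqc_container [ARGS...]
# Sketch of a wrapper: run fastqc inside the image, forwarding all arguments.
fastqc_container() {
    singularity exec "$FASTQC_IMAGE" fastqc "$@"
}

# Usage:
#   fastqc_container --version
```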
For more information, please visit this page: Singularity advanced guide