Difference between revisions of "Conda"

From UFRC
 
(55 intermediate revisions by 5 users not shown)
Line 1: Line 1:
[[Category:Software]]
+
[[Category:Software]][[Category:Programming]]
 
{|<!--CONFIGURATION: REQUIRED-->
 
{|<!--CONFIGURATION: REQUIRED-->
 
|{{#vardefine:app|conda}}
 
|{{#vardefine:app|conda}}
 
|{{#vardefine:url|https://docs.conda.io/en/latest/miniconda.html}}
 
|{{#vardefine:url|https://docs.conda.io/en/latest/miniconda.html}}
 
<!--CONFIGURATION: OPTIONAL (|1}} means it's ON)-->
 
<!--CONFIGURATION: OPTIONAL (|1}} means it's ON)-->
|{{#vardefine:conf|}}          <!--CONFIGURATION-->
+
|{{#vardefine:conf|1}}          <!--CONFIGURATION-->
 
|{{#vardefine:exe|1}}            <!--ADDITIONAL INFO-->
 
|{{#vardefine:exe|1}}            <!--ADDITIONAL INFO-->
 
|{{#vardefine:job|}}            <!--JOB SCRIPTS-->
 
|{{#vardefine:job|}}            <!--JOB SCRIPTS-->
Line 13: Line 13:
 
|{{#vardefine:installation|}} <!--INSTALLATION-->
 
|{{#vardefine:installation|}} <!--INSTALLATION-->
 
|}
 
|}
 +
For help creating and managing personal environments, whether for command-line tool use or Python package use in SLURM jobs or Jupyter kernels, see [https://help.rc.ufl.edu/doc/Managing_Python_environments_and_Jupyter_kernels Managing Python environments and Jupyter kernels].
 
<!--BODY-->
 
<!--BODY-->
 
<!--Description-->
 
<!--Description-->
Line 18: Line 19:
 
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
 
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
  
Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments. Separating applications in separate conda environments allows installation of incompatible dependencies - python2 and python3 for example.  
+
Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments. Separating applications in separate conda environments allows installation of incompatible dependencies - python2 and python3 for example.
  
Miniconda is a free minimal installer for conda. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages, including pip, zlib and a few others. Use the conda install command to install additional conda packages from repositories (channels) like [https://bioconda.github.io/ bioconda].
+
'''Notes:'''
 +
* For a ''faster conda'' see [[Mamba]].
 +
* Do not use the 'Anaconda Distribution' to install conda/mamba. Use an open source [https://github.com/conda-forge/miniforge MiniForge] installer or, better yet, our 'conda' environment module and open source repositories such as conda-forge and bioconda. We provide a 'known good' ~/.condarc configuration script that gets installed on the first 'module load conda' run.
  
 
<!--Modules-->
 
<!--Modules-->
Line 29: Line 32:
 
* HPC_{{uc:{{#var:app}}}}_BIN - executable directory
 
* HPC_{{uc:{{#var:app}}}}_BIN - executable directory
 
<!--Configuration-->
 
<!--Configuration-->
{{#if: {{#var: conf}}|==Configuration==
+
{{#if: {{#var: conf}}| ==Background==
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.
+
<onlyinclude>
 +
Many projects that use [[Python]] code require careful management of the respective Python environments. Rapid changes in package dependencies, package version conflicts, deprecation of APIs (function calls) by individual projects, and obsolescence of system drivers and libraries make it virtually impossible to use an arbitrary set of packages or create one all-encompassing environment that will serve everyone's needs over long periods of time. The high velocity of changes in the popular ML/DL frameworks and packages and GPU computing exacerbates the problem.
 +
 
 +
== The problem with <code>pip install</code> ==
 +
<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
 +
''Expand this section to view pip problems and how conda/mamba mends them.''
 +
<div class="mw-collapsible-content" style="padding: 5px;">
 +
Most guides and project documentation for installing python packages recommend using <code>pip install</code> for package installation. While <code>pip</code> is easy to use and works for many use cases, there are some major drawbacks. There are a few issues with doing <code>pip install</code> on a supercomputer like HiPerGator:
 +
 
 +
* Pip by default installs binary packages (wheels), which are often built on systems incompatible with HiPerGator. This can lead to importing errors, and its attempts to build from source will fail without additional configuration.
 +
* If you are pip installing a package that is/will be installed in an environment provided by UFRC, your pip version will take precedence. Your dependencies eventually become incompatible causing errors, with even one pip install making environments unusable.
 +
* Different packages may require different versions of the same package as dependencies leading to impossible to reconcile installation scenarios. This becomes a challenge to manage with <code>pip</code> as there isn't a method to swap active versions.
 +
* On its own, <code>pip</code> installs '''everything''' in one location: <code>~/.local/lib/python3.X/site-packages/</code>.
 +
 
 +
<big><big><strong>Conda and Mamba to the rescue!</strong></big></big>
 +
[[File:Mamba.png |left | 200px]]<code>conda</code> and its newer, faster, drop-in replacement <code>mamba</code> were written to solve some of these issues. They represent a higher level of packaging abstraction that can combine compiled packages, applications, and libraries as well as <code>pip</code>-installed python packages. They also allow easier management of project-specific environments and switching between environments as needed. They make it much easier to report the exact configuration of packages in an environment, facilitating reproducibility. Moreover, conda environments don't even have to be activated to be used; in most cases adding the path to the conda environment's 'bin' directory to the $PATH in the shell environment is sufficient for using them.
 +
 
 +
<big>'''A caveat'''</big>
 +
 
 +
<code>conda</code> and <code>mamba</code> get packages from channels, or repositories of prebuilt packages. While several channels are available, such as <code>conda-forge</code> and <code>bioconda</code>, not every Python package is available from them, because each package has to be packaged for conda first. You may still need to use <code>pip</code> to install some packages as noted later. '''However, <code>conda</code> still helps manage environments by installing packages into separate directory trees rather than installing all packages into a single folder as pip does.'''
 +
</div></div>
 +
 
 +
==Configuration==
 +
<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
 +
''Expand this section to view instructions for configuring Conda''
 +
<div class="mw-collapsible-content" style="padding: 5px;">
 +
<!-- See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details. -->
 +
'''<big>The <code>~/.condarc</code> configuration file</big>'''
 +
 
 +
<code>conda</code>'s behavior is controlled by a configuration file in your home directory called <code>.condarc</code>. The dot at the start of the name means that the file is hidden from the 'ls' file-listing command by default. If you have not run <code>conda</code> before, you won't have this file. Whether the file exists or not, the steps here will help you modify the file to work best on HiPerGator. The first load of the <code>conda</code> environment module on HiPerGator puts the current ''best practice'' <code>.condarc</code> into your home directory.
 +
 
 +
'''<big><code>conda</code> package cache location</big>'''
 +
 
 +
<code>conda</code> caches (keeps a copy of) all downloaded packages, by default in the <code>~/.conda/pkgs</code> directory tree. If you install a lot of packages you may end up filling your home quota. You can change the default package cache path. To do so, add or change the <code>pkgs_dirs</code> setting in your <code>~/.condarc</code> configuration file, e.g.:
 +
<pre>
 +
pkgs_dirs:
 +
  - /blue/mygroup/$USER/conda/pkgs
 +
</pre>
 +
Replace <code>mygroup</code> with your actual group name.
 +
 
 +
'''<big><code>conda</code> environment location</big>'''
 +
 
 +
<code>conda</code> puts all packages installed in a particular environment into a single directory. By default, ''named'' <code>conda</code> environments are created in the <code>~/.conda/envs</code> directory tree. They can quickly grow in size and, especially if you have many environments, fill the 40GB home directory quota. For example, the environment we will create in this training is 5.3GB in size. As such, it is important to use ''path'' based (<code>conda create -p PATH</code>) conda environments, which let you choose any path for a particular environment. For example, you can keep a project-specific conda environment close to the project data in <code>/blue/</code>, where your group has terabyte(s) of space.
 +
 
 +
You can also change the default path for ''named'' environments (<code>conda create -n NAME</code>) if you prefer to keep all <code>conda</code> environments in the same directory tree. To do so, add or change the <code>envs_dirs</code> setting in the <code>~/.condarc</code> configuration file, e.g.:
 +
<pre>
 +
envs_dirs:
 +
  - /blue/mygroup/share/conda/envs
 +
  #or alternatively: - /blue/mygroup/$USER/conda/envs
 +
</pre>
 +
Replace <code>mygroup</code> with your actual group name.
 +
 
 +
One way to edit your <code>~/.condarc</code> file is to type: <code>nano ~/.condarc</code>
 +
 
 +
If the file is empty, paste in the text below, editing the <code>envs_dirs:</code> and <code>pkgs_dirs:</code> entries as shown. If the file has contents, update those lines.
 +
 
 +
{{Note|Your <code>~/.condarc</code> should look something like this when you are done editing (again, replacing <code>group</code> and <code>user</code> in the paths with your group and username).|note}}
 +
 
 +
<pre>
 +
channels:
 +
- conda-forge
 +
- bioconda
 +
- defaults
 +
envs_dirs:
 +
- /blue/group/user/conda/envs
 +
pkgs_dirs:
 +
- /blue/group/user/conda/pkgs
 +
auto_activate_base: false
 +
auto_update_conda: false
 +
always_yes: false
 +
show_channel_urls: false
 +
</pre>
 +
</div>
 +
</div>
 +
<div class="mw-collapsible-content" style="padding: 5px;">
 +
<noinclude>
 
|}}
 
|}}
 
<!--Run-->
 
<!--Run-->
{{#if: {{#var: exe}}|==Additional Information==
+
{{#if: {{#var: exe}}|
 +
</noinclude>==Create and activate a Conda environment==
 +
<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
 +
''Expand this section to view instructions for setting up environments.''
 +
<div class="mw-collapsible-content" style="padding: 5px;">
 
UF Research Computing Applications Team uses conda for many application installs behind the scenes. We are happy to [https://support.rc.ufl.edu install applications on request] for you. However, if you would like to use conda to create multiple environments for your personal projects we encourage you to do so. Here are some recommendations for successful conda use on HiPerGator.
 
UF Research Computing Applications Team uses conda for many application installs behind the scenes. We are happy to [https://support.rc.ufl.edu install applications on request] for you. However, if you would like to use conda to create multiple environments for your personal projects we encourage you to do so. Here are some recommendations for successful conda use on HiPerGator.
 +
*See [https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html] for the original documentation on managing conda environments.
 +
*We recommend creating environments by 'path', so they won't fill up your home directory (check quota with home_quota). The resulting environment should be located in the project(s) directory tree in /blue for better tracking of installs  and better filesystem performance compared to home.
  
* We suggest not installing your own copy of miniconda or anaconda, but using one of our conda or [[Mamba|mamba]] environment modules to get the 'conda engine' ''for free''. The only difference is that a personal miniconda3 install will overwrite your ~/.bashrc to make creating environments by name (with the -n argument) in your ~/.conda/envs directory more straightforward and the 'conda activate' command available in your shell environment. We don't use that approach for our work and do not recommend it.
+
{{Note|'''If you plan on using a GPU''' see below|warn}}
 +
To make sure your code will run on GPUs, install a recent <code>cudatoolkit</code> package that works with the NVIDIA drivers on HPG (currently 12.x, but older versions are still supported) alongside the pytorch or tensorflow package(s). See the RC-provided tensorflow or pytorch installs for examples if needed. Mamba can detect whether there is a GPU in the environment, so the easiest approach is to run the mamba install command in a GPU session. Alternatively, on any node, or if a CPU-only pytorch package was already installed, you can explicitly request a GPU build of pytorch when running mamba install, e.g.:
 +
mamba install cudatoolkit=11.3 pytorch=1.12.1=gpu_cuda* -c pytorch
 +
'''<big>Load the <code>conda</code> module</big>'''
  
We would like to suggest using '-p' argument to create environments by 'path', so they wouldn't fill up your home directory (check quota with home_quota) and could be located conveniently close to the project(s) they serve in /blue for better tracking of installs
+
Before we can run <code>conda</code> or <code>mamba</code> on HiPerGator, we need to load the <code>conda</code> module:
  
Use
+
  module load conda
  conda create -yp /path/to/the/environment
 
command instead of 'conda create -n name_of_the_env'
 
  
The above has the advantage over the default conda behavior of creating envs by name in your home directory that you can create an environment in blue and not run the risk of filling up your home_quota. It's also much easier to keep track of the many environments if they are located closer to the projects they serve rather than being dumped into ~/.conda/envs as conda usually does by default. Note, as described in
+
<big>'''Create your environment'''</big>
  
* We would like to point out that there is a clear distinction between using conda environments to install packages and using applications installed in those environments. To use a conda environment you need to add its 'bin' directory to your shell $PATH whether in an interactive session or a job script submitted to the scheduler. To modify a conda environment by installing or removing packages you need to 'activate' the environment. Because of that distinction we strongly recommend against allowing miniconda to insert its activation code into your shell initialization file ~/.bashrc. If you already let it do so please remove the offending code from ~/.bashrc. It's the text within and including the
+
'''Create a ''name based'' environment'''
<pre>
+
[[File:Mamba create.png|frameless|right|250px]]
# >>> conda initialize >>>
+
 
...
+
To create your first ''name based'' <code>conda</code> environment (see path based instructions below), run the following command. In this example, I am creating an environment named <code>hfrl</code>:
# <<< conda initialize <<<
+
 
</pre>
+
mamba create -n hfrl
 +
 
 +
The screenshot to the right is the output from running that command. Yours should look similar.
 +
 
 +
'''Note:
 +
# You do not need to manually create the folders that you set up in your <code>~/.condarc</code> file. <code>mamba</code> will take care of that for you.
 +
# When creating a Conda environment you can also install Conda packages as needed at the same time, e.g.:
 +
<code>mamba create -n hfrl python=3.9 pytorch numpy=1.22</code>'''
 +
 
 +
'''Create a ''path based'' environment'''
 +
 
 +
To create a ''path based'' <code>conda</code> environment use the '-p PATH' argument:
 +
<code>mamba create -p PATH</code>
 +
e.g.
 +
<code>mamba create -p /blue/mygroup/share/project42/conda/envs/hfrl/</code>
 +
 
 +
<big>'''Activate the new environment'''</big>
 +
 
 +
To activate our environment (whether created with <code>mamba</code> or <code>conda</code>) we use the <code>conda activate env_name</code> command. Let's activate our new environment:
 +
 
 +
<code>conda activate hfrl</code>
 +
 
 +
or
  
lines.
+
<code>conda activate /blue/mygroup/share/project42/conda/envs/hfrl/</code>
  
Instead, load our conda environment module, which we keep up-to-date with conda releases any time you need to create conda environments and to install packages in them. The difference is that 'conda activate' command will not be available, so you will have to use 'source activate' command instead.
+
Notice that your command prompt changes when you activate an environment to indicate which environment is active, showing that in parentheses before the other information:
  
E.g.
+
<pre> (hfrl) [magitz@c0907a-s23 magitz]$ </pre>
  
$ module load conda
+
'''Note: ''path based'' environment activation is really only needed for package installation. For using the environment just add the path to its <code>bin</code> directory to $PATH in your job script.'''
$ source activate /path/to/the/environment
 
  
 
Once you are done installing packages inside the environment you can use
 
Once you are done installing packages inside the environment you can use
Line 71: Line 177:
 
If you have a project-specific conda environment at '/home/myuser/envs/project1/' add the following into your job script before executing any commands
 
If you have a project-specific conda environment at '/home/myuser/envs/project1/' add the following into your job script before executing any commands
 
  export PATH=/home/myuser/envs/project1/bin:$PATH
 
  export PATH=/home/myuser/envs/project1/bin:$PATH
 +
</div></div>
 +
 +
==Export or import an environment==
 +
<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
 +
''Expand this section to view instructions.''
 +
<div class="mw-collapsible-content" style="padding: 5px;">
 +
<big>'''Export your environment to an <code>environment.yml</code> file'''</big>
 +
 +
Now that you have your environment working, you may want to document its contents and/or share it with others. The <code>environment.yml</code> file defines the environment and can be used to build a new environment with the same setup.
 +
 +
To export an environment file from an existing environment, run:
 +
 +
<code>conda env export > hfrl.yml</code>
 +
 +
You can inspect the contents of this file with <code>cat hfrl.yml</code>. This file defines the packages and versions that make up the environment as it is at this point in time. Note that it also includes packages that were installed via <code>pip</code>.
 +
 +
'''<big>Create an environment from a yaml file</big>'''
 +
 +
If you share the environment yaml file created above with another user, they can create a copy of your environment using the command:
 +
 +
<code>conda env create --file hfrl.yml</code>
 +
 +
They may need to edit the last line to change the location to match where they want their environment created.
 +
</div></div>
 +
 +
== Group environments ==
  
 +
It is possible to create a shared environment accessed by a group on HiPerGator, storing the environment in, for example, <code>/blue/group/share/conda</code>. In general, this works best if only one user has write access to the environment. All installs should be made by that one user and communicated to the other users in the group. We recommend that each user's umask be set to group-friendly permissions, such as umask 007. See [[Sharing Within A Cluster]].
 +
</onlyinclude>
 
|}}
 
|}}
 
<!--Job Scripts-->
 
<!--Job Scripts-->
Line 103: Line 237:
 
See the [[{{PAGENAME}}_Install]] page for {{#var: app}} installation notes.|}}
 
See the [[{{PAGENAME}}_Install]] page for {{#var: app}} installation notes.|}}
 
<!--Turn the Table of Contents and Edit paragraph links ON/OFF-->
 
<!--Turn the Table of Contents and Edit paragraph links ON/OFF-->
__NOTOC____NOEDITSECTION__
+
__NOEDITSECTION__
 +
<nowiki>__NOTOC__</nowiki>

Latest revision as of 19:22, 23 August 2024

For help creating and managing personal environments, whether for command-line tool use or Python package use in SLURM jobs or Jupyter kernels, see Managing Python environments and Jupyter kernels.

Description

conda website  

Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments. Separating applications in separate conda environments allows installation of incompatible dependencies - python2 and python3 for example.

Notes:

  • For a faster conda see Mamba.
  • Do not use the 'Anaconda Distribution' to install conda/mamba. Use an open source MiniForge installer or, better yet, our 'conda' environment module and open source repositories such as conda-forge and bioconda. We provide a 'known good' ~/.condarc configuration script that gets installed on the first 'module load conda' run.

Environment Modules

Run module spider conda to find out what environment modules are available for this application.

System Variables

  • HPC_CONDA_DIR - installation directory
  • HPC_CONDA_BIN - executable directory

Background

Many projects that use Python code require careful management of the respective Python environments. Rapid changes in package dependencies, package version conflicts, deprecation of APIs (function calls) by individual projects, and obsolescence of system drivers and libraries make it virtually impossible to use an arbitrary set of packages or create one all-encompassing environment that will serve everyone's needs over long periods of time. The high velocity of changes in the popular ML/DL frameworks and packages and GPU computing exacerbates the problem.

The problem with pip install

Expand this section to view pip problems and how conda/mamba mends them.

Most guides and project documentation for installing python packages recommend using pip install for package installation. While pip is easy to use and works for many use cases, there are some major drawbacks. There are a few issues with doing pip install on a supercomputer like HiPerGator:

  • By default, pip installs binary packages (wheels), which are often built on systems incompatible with HiPerGator. This can lead to import errors, and pip's attempts to build from source will fail without additional configuration.
  • If you pip install a package that is, or later will be, installed in an environment provided by UFRC, your pip-installed version takes precedence. Over time your dependencies become incompatible and cause errors; even a single pip install can make an environment unusable.
  • Different packages may require different versions of the same dependency, leading to installation scenarios that are impossible to reconcile. This is hard to manage with pip because it has no way to switch between active versions.
  • On its own, pip installs everything in one location: ~/.local/lib/python3.X/site-packages/.

Conda and Mamba to the rescue!

conda and its newer, faster, drop-in replacement mamba were written to solve some of these issues. They represent a higher level of packaging abstraction that can combine compiled packages, applications, and libraries as well as pip-installed python packages. They also allow easier management of project-specific environments and switching between environments as needed. They make it much easier to report the exact configuration of packages in an environment, facilitating reproducibility. Moreover, conda environments don't even have to be activated to be used; in most cases adding the path to the conda environment's 'bin' directory to the $PATH in the shell environment is sufficient for using them.

A caveat

conda and mamba get packages from channels, or repositories of prebuilt packages. While several channels are available, such as conda-forge and bioconda, not every Python package is available from them, because each package has to be packaged for conda first. You may still need to use pip to install some packages as noted later. However, conda still helps manage environments by installing packages into separate directory trees rather than installing all packages into a single folder as pip does.

Configuration

Expand this section to view instructions for configuring Conda

The ~/.condarc configuration file

conda's behavior is controlled by a configuration file in your home directory called .condarc. The dot at the start of the name means that the file is hidden from the 'ls' file-listing command by default. If you have not run conda before, you won't have this file. Whether the file exists or not, the steps here will help you modify the file to work best on HiPerGator. The first load of the conda environment module on HiPerGator puts the current best practice .condarc into your home directory.

conda package cache location

conda caches (keeps a copy of) all downloaded packages, by default in the ~/.conda/pkgs directory tree. If you install a lot of packages you may end up filling your home quota. You can change the default package cache path. To do so, add or change the pkgs_dirs setting in your ~/.condarc configuration file, e.g.:

pkgs_dirs:
  - /blue/mygroup/$USER/conda/pkgs

Replace mygroup with your actual group name.
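
Before moving the cache, it can help to see how much space it currently uses. The following is a minimal sketch that assumes the default cache location described above; it only reads the directory and changes nothing.

```shell
#!/bin/sh
# Default package cache location described above.
PKGS_DIR="$HOME/.conda/pkgs"

# Report the cache size if it exists; otherwise say so.
if [ -d "$PKGS_DIR" ]; then
    du -sh "$PKGS_DIR"
else
    echo "No package cache at $PKGS_DIR yet"
fi
```

If the reported size is a large share of your home quota, pointing pkgs_dirs at /blue as shown above is likely worthwhile.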

conda environment location

conda puts all packages installed in a particular environment into a single directory. By default, named conda environments are created in the ~/.conda/envs directory tree. They can quickly grow in size and, especially if you have many environments, fill the 40GB home directory quota. For example, the environment we will create in this training is 5.3GB in size. As such, it is important to use path based (conda create -p PATH) conda environments, which let you choose any path for a particular environment. For example, you can keep a project-specific conda environment close to the project data in /blue/, where your group has terabyte(s) of space.

You can also change the default path for named environments (conda create -n NAME) if you prefer to keep all conda environments in the same directory tree. To do so, add or change the envs_dirs setting in the ~/.condarc configuration file, e.g.:

envs_dirs:
  - /blue/mygroup/share/conda/envs
  #or alternatively: - /blue/mygroup/$USER/conda/envs

Replace mygroup with your actual group name.

One way to edit your ~/.condarc file is to type: nano ~/.condarc

If the file is empty, paste in the text below, editing the envs_dirs: and pkgs_dirs: entries as shown. If the file has contents, update those lines.

Your ~/.condarc should look something like this when you are done editing (again, replacing group and user in the paths with your group and username).
channels:
- conda-forge
- bioconda
- defaults
envs_dirs:
- /blue/group/user/conda/envs
pkgs_dirs:
- /blue/group/user/conda/pkgs
auto_activate_base: false
auto_update_conda: false
always_yes: false
show_channel_urls: false
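
The same file can be generated from the shell instead of typed by hand. This is a minimal sketch, assuming a hypothetical group name (mygroup) and a fallback username (myuser); it writes to a scratch file in a temporary directory rather than overwriting your real ~/.condarc, so you can review the result before copying it into place.

```shell
#!/bin/sh
# Hypothetical values: replace mygroup with your actual group.
# USER is normally set by the login shell; myuser is only a fallback.
GROUP="mygroup"
ME="${USER:-myuser}"

# Write to a scratch file first so an existing ~/.condarc is not overwritten.
OUT_DIR="$(mktemp -d)"
CONDARC="$OUT_DIR/condarc.example"

cat > "$CONDARC" <<EOF
channels:
  - conda-forge
  - bioconda
  - defaults
envs_dirs:
  - /blue/$GROUP/$ME/conda/envs
pkgs_dirs:
  - /blue/$GROUP/$ME/conda/pkgs
auto_activate_base: false
auto_update_conda: false
always_yes: false
show_channel_urls: false
EOF

# Show the generated file for review.
cat "$CONDARC"
```

Once the contents look right, copy the scratch file to ~/.condarc yourself.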

Create and activate a Conda environment

Expand this section to view instructions for setting up environments.

UF Research Computing Applications Team uses conda for many application installs behind the scenes. We are happy to install applications on request for you. However, if you would like to use conda to create multiple environments for your personal projects we encourage you to do so. Here are some recommendations for successful conda use on HiPerGator.

  • See https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html for the original documentation on managing conda environments.
  • We recommend creating environments by 'path', so they won't fill up your home directory (check quota with home_quota). The resulting environment should be located in the project(s) directory tree in /blue for better tracking of installs and better filesystem performance compared to home.

If you plan on using a GPU, see below.

To make sure your code will run on GPUs, install a recent cudatoolkit package that works with the NVIDIA drivers on HPG (currently 12.x, but older versions are still supported) alongside the pytorch or tensorflow package(s). See the RC-provided tensorflow or pytorch installs for examples if needed. Mamba can detect whether there is a GPU in the environment, so the easiest approach is to run the mamba install command in a GPU session. Alternatively, on any node, or if a CPU-only pytorch package was already installed, you can explicitly request a GPU build of pytorch when running mamba install, e.g.:

mamba install cudatoolkit=11.3 pytorch=1.12.1=gpu_cuda* -c pytorch

Load the conda module

Before we can run conda or mamba on HiPerGator, we need to load the conda module:

module load conda

Create your environment

Create a name based environment

Mamba create.png

To create your first name based conda environment (see path based instructions below), run the following command. In this example, I am creating an environment named hfrl:

mamba create -n hfrl

The screenshot to the right is the output from running that command. Yours should look similar.

Note:

  1. You do not need to manually create the folders that you set up in your ~/.condarc file. mamba will take care of that for you.
  2. When creating a Conda environment you can also install Conda packages as needed at the same time, e.g.:

mamba create -n hfrl python=3.9 pytorch numpy=1.22

Create a path based environment

To create a path based conda environment use the '-p PATH' argument:

mamba create -p PATH

e.g.

mamba create -p /blue/mygroup/share/project42/conda/envs/hfrl/
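
Building the path from variables keeps the layout consistent across projects. The sketch below assumes hypothetical group, project, and environment names, and a python=3.9 package spec chosen only for illustration. By default it prints the command instead of running it; set DRY_RUN=0 after 'module load conda' to create the environment.

```shell
#!/bin/sh
# Hypothetical group, project, and environment names; adjust to your own layout.
GROUP="mygroup"
PROJECT="project42"
ENV_NAME="hfrl"
ENV_PREFIX="/blue/$GROUP/share/$PROJECT/conda/envs/$ENV_NAME"

# The path contains no spaces, so plain word splitting of the command is safe here.
CREATE_CMD="mamba create -y -p $ENV_PREFIX python=3.9"

# Print the command by default; set DRY_RUN=0 to actually run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CREATE_CMD"
else
    $CREATE_CMD
fi
```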

Activate the new environment

To activate our environment (whether created with mamba or conda) we use the conda activate env_name command. Let's activate our new environment:

conda activate hfrl

or

conda activate /blue/mygroup/share/project42/conda/envs/hfrl/

Notice that your command prompt changes when you activate an environment to indicate which environment is active, showing that in parentheses before the other information:

 (hfrl) [magitz@c0907a-s23 magitz]$ 

Note: path based environment activation is really only needed for package installation. For using the environment just add the path to its bin directory to $PATH in your job script.

Once you are done installing packages inside the environment, you can leave it with

$ conda deactivate

We do not recommend activating conda environments when using them, i.e., when running programs installed in the environments. Please prepend the path to that environment's bin directory to your $PATH instead.

E.g., if you have a project-specific conda environment at '/home/myuser/envs/project1/', add the following to your job script before executing any commands:

export PATH=/home/myuser/envs/project1/bin:$PATH
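
In context, a job script using this approach might look like the sketch below. The #SBATCH values are placeholders rather than recommendations, and the environment path is the one from the example above; the script assumes that environment already exists.

```shell
#!/bin/bash
#SBATCH --job-name=project1
#SBATCH --time=01:00:00
#SBATCH --mem=4gb

# Path from the example above; the environment must already exist.
ENV_PREFIX="/home/myuser/envs/project1"

# Prepend the environment's bin directory so its programs are found first.
export PATH="$ENV_PREFIX/bin:$PATH"

# Report which python the job would use; no 'conda activate' is needed.
command -v python || echo "python not found on PATH"
```

Because only $PATH changes, the job does not depend on any conda initialization code in ~/.bashrc.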

Export or import an environment

Expand this section to view instructions.

Export your environment to an environment.yml file

Now that you have your environment working, you may want to document its contents and/or share it with others. The environment.yml file defines the environment and can be used to build a new environment with the same setup.

To export an environment file from an existing environment, run:

conda env export > hfrl.yml

You can inspect the contents of this file with cat hfrl.yml. This file defines the packages and versions that make up the environment as it is at this point in time. Note that it also includes packages that were installed via pip.

Create an environment from a yaml file

If you share the environment yaml file created above with another user, they can create a copy of your environment using the command:

conda env create --file hfrl.yml

They may need to edit the last line of the file (the prefix: entry) to match the location where they want their environment created.
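
One way to handle that last line is to drop it and choose the location at creation time. The sketch below uses a small stand-in for an exported file (its contents are illustrative, not real export output) and a hypothetical output name, hfrl.noprefix.yml.

```shell
#!/bin/sh
WORK_DIR="$(mktemp -d)"

# Stand-in for a file produced by 'conda env export'; contents are illustrative only.
cat > "$WORK_DIR/hfrl.yml" <<'EOF'
name: hfrl
channels:
  - conda-forge
dependencies:
  - python=3.9
prefix: /blue/mygroup/share/project42/conda/envs/hfrl
EOF

# Drop the exporter's prefix line so the file no longer points at their location.
grep -v '^prefix:' "$WORK_DIR/hfrl.yml" > "$WORK_DIR/hfrl.noprefix.yml"
cat "$WORK_DIR/hfrl.noprefix.yml"

# The recipient would then choose their own location, for example:
#   conda env create --file hfrl.noprefix.yml -p /blue/othergroup/share/conda/envs/hfrl
```

The path in the final comment is hypothetical; the recipient substitutes a directory in their own group's /blue space.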

Group environments

It is possible to create a shared environment accessed by a group on HiPerGator, storing the environment in, for example, /blue/group/share/conda. In general, this works best if only one user has write access to the environment. All installs should be made by that one user and communicated to the other users in the group. We recommend that each user's umask be set to group-friendly permissions, such as umask 007. See Sharing Within A Cluster.
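
To see what umask 007 does in practice, the following sketch creates a directory and a file in a temporary location (standing in for a shared directory under /blue) and lists the resulting permissions.

```shell
#!/bin/sh
# Stand-in for a shared directory such as /blue/group/share/conda.
DEMO_DIR="$(mktemp -d)"

# Group-friendly permissions: full access for owner and group, none for others.
umask 007

mkdir "$DEMO_DIR/envs"
touch "$DEMO_DIR/envs/example.txt"

# Show the resulting modes (directory rwxrwx---, file rw-rw----).
ls -ld "$DEMO_DIR/envs" "$DEMO_DIR/envs/example.txt"
```

The umask only affects files created afterwards in the current shell; to make it persistent it would have to be set in a shell startup file.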



