Conda: Difference between revisions
Revision by Moskalenko (no edit summary). 7 intermediate revisions by 2 users are not shown.
Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments. Keeping applications in separate conda environments allows installation of incompatible dependencies, such as python2 and python3.
'''Notes:'''
* For a ''faster conda'' see [[Mamba]].
* Do not use the 'Anaconda Distribution' to install conda/mamba. Use an open source [https://github.com/conda-forge/miniforge MiniForge] installer or, better yet, our 'conda' environment module and open source repositories such as conda-forge and bioconda. We provide a 'known good' ~/.condarc configuration script that gets installed on the first 'module load conda' run.
<!--Modules-->
* By default, pip installs binary packages (wheels), which are often built on systems incompatible with HiPerGator. This can lead to import errors, and pip's attempts to build from source will fail without additional configuration.
* If you pip install a package that is, or will be, installed in an environment provided by UFRC, your pip version will take precedence. Your dependencies eventually become incompatible and cause errors; even one pip install can make an environment unusable.
* Different packages may require different versions of the same dependency, leading to installation scenarios that are impossible to reconcile. This is hard to manage with <code>pip</code>, as there is no method to swap active versions.
* On its own, <code>pip</code> installs '''everything''' in one location: <code>~/.local/lib/python3.X/site-packages/</code>.
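To see where user-level pip installs accumulate, you can ask Python for its per-user site-packages directory. The following is a minimal sketch, assuming <code>python3</code> is on your PATH; the variable name <code>user_site</code> is only illustrative.

```shell
# Ask Python for the per-user site-packages directory
# (the location used by "pip install --user").
user_site="$(python3 -m site --user-site)"
echo "User-level pip installs land in: ${user_site}"

# List what is already there, if the directory exists.
if [ -d "${user_site}" ]; then
    ls "${user_site}"
else
    echo "No user-level packages installed yet."
fi
```

Anything listed there is visible to every Python of that version you run, which is why a single stray install can affect several environments.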
<big>'''A caveat'''</big>
<code>conda</code> and <code>mamba</code> get packages from channels, which are repositories of prebuilt packages. While several channels are available, such as <code>conda-forge</code> or <code>bioconda</code>, not every Python package is available from them, because a package has to be packaged for conda first. You may still need to use <code>pip</code> to install some packages, as noted later. '''However, <code>conda</code> still helps manage environments by installing packages into separate directory trees, rather than installing all packages into a single folder as pip does.'''
</div></div>
<pre>
pkgs_dirs:
  - /blue/mygroup/$USER/conda/pkgs
</pre>
Replace <code>mygroup</code> with your actual group name.
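If you are unsure what to substitute for <code>mygroup</code>, the sketch below prints a candidate snippet for review. It assumes that your primary Unix group (as reported by <code>id -gn</code>) matches your <code>/blue</code> group directory, which may not hold for every account; the variable names are illustrative, and nothing is written to <code>~/.condarc</code>.

```shell
# Assumption: your primary Unix group matches your /blue group directory.
group="$(id -gn)"

# Build a candidate snippet for review. $USER is kept literal on purpose,
# as in the example above; this does not modify ~/.condarc.
snippet="$(printf 'pkgs_dirs:\n  - /blue/%s/$USER/conda/pkgs\n' "${group}")"
echo "${snippet}"
```

Compare the printed path with your actual group directory before copying it into your configuration.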
''Expand this section to view instructions for setting up environments.''
<div class="mw-collapsible-content" style="padding: 5px;">
The UF Research Computing Applications Team uses conda for many application installs behind the scenes. We are happy to [https://support.rc.ufl.edu install applications on request] for you. However, if you would like to use conda to create multiple environments for your personal projects, we encourage you to do so. Here are some recommendations for successful conda use on HiPerGator.
*See [https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html] for the original documentation on managing conda environments.
{{Note|'''If you plan on using a GPU''' see below|warn}}
To make sure your code will run on GPUs, install a recent <code>cudatoolkit</code> package that works with the NVIDIA drivers on HPG (currently 12.x, but older versions are still supported) alongside the pytorch or tensorflow package(s). See the RC-provided tensorflow or pytorch installs for examples if needed. Mamba can detect whether a GPU is present, so the easiest approach is to run the mamba install command in a GPU session. Alternatively, if you run mamba install on a node without a GPU, or if a CPU-only pytorch package is already installed, explicitly request a GPU build of pytorch in the mamba install command. E.g.
 mamba install cudatoolkit=11.3 pytorch=1.12.1=gpu_cuda* -c pytorch
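After the install, it can be worth confirming that the GPU build was actually selected. The following is a minimal sketch, assuming it is run inside the activated environment on a GPU node and that pytorch is the framework in use; <code>gpu_report</code> is an illustrative variable name.

```shell
# Ask pytorch whether it can see a CUDA device. The embedded Python
# falls back to a message if pytorch is not importable.
gpu_report="$(python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("pytorch is not installed in this environment")
else:
    print("pytorch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
EOF
)"
echo "${gpu_report}"
```

If CUDA is reported as unavailable in a GPU session, a CPU-only build was probably installed.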
'''<big>Load the <code>conda</code> module</big>'''
 module load conda
<big>'''Create your environment'''</big>
'''Create a ''name based'' environment'''
[[File:Mamba create.png|frameless|right|250px]]
To create your first ''name based'' <code>conda</code> environment (see the path based instructions below), run the following command. In this example, I am creating an environment named <code>hfrl</code>:
The screenshot to the right is the output from running that command. Yours should look similar.
'''Notes:'''
# You do not need to manually create the folders that you set up in your <code>~/.condarc</code> file. <code>mamba</code> will take care of that for you.
# When creating a Conda environment, you can also install Conda packages at the same time, e.g.:
 <code>mamba create -n hfrl python=3.9 pytorch numpy=1.22</code>
'''Create a ''path based'' environment'''
== Group environments ==
It is possible to create a shared environment accessed by a group on HiPerGator, storing the environment in, for example, <code>/blue/group/share/conda</code>. In general, this works best if only one user has write access to the environment. All installs should be made by that one user and communicated to the other users in the group. It is recommended that each user's umask be set to group-friendly permissions, such as umask 007. See [[Sharing Within A Cluster]].
</onlyinclude>
|}}
Latest revision as of 19:22, 23 August 2024
For help creating and managing personal environments, whether for command-line tool use or for Python package use in SLURM jobs or Jupyter kernels, see Managing Python environments and Jupyter kernels.
Description
Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments. Keeping applications in separate conda environments allows installation of incompatible dependencies, such as python2 and python3.
Notes:
- For a faster conda see Mamba.
- Do not use the 'Anaconda Distribution' to install conda/mamba. Use an open source MiniForge installer or, better yet, our 'conda' environment module and open source repositories such as conda-forge and bioconda. We provide a 'known good' ~/.condarc configuration script that gets installed on the first 'module load conda' run.
Environment Modules
Run module spider conda to find out what environment modules are available for this application.
System Variables
- HPC_CONDA_DIR - installation directory
- HPC_CONDA_BIN - executable directory
Background
Many projects that use Python code require careful management of the respective Python environments. Rapid changes in package dependencies, package version conflicts, deprecation of APIs (function calls) by individual projects, and obsolescence of system drivers and libraries make it virtually impossible to use an arbitrary set of packages or create one all-encompassing environment that will serve everyone's needs over long periods of time. The high velocity of changes in the popular ML/DL frameworks and packages and GPU computing exacerbates the problem.
The problem with pip install
Expand this section to view pip problems and how conda/mamba mends them.
Configuration
Expand this section to view instructions for configuring Conda
Create and activate a Conda environment
Expand this section to view instructions for setting up environments.
Export or import an environment
Expand this section to view instructions.
Group environments
It is possible to create a shared environment accessed by a group on HiPerGator, storing the environment in, for example, /blue/group/share/conda. In general, this works best if only one user has write access to the environment. All installs should be made by that one user and communicated to the other users in the group. It is recommended that each user's umask be set to group-friendly permissions, such as umask 007. See Sharing Within A Cluster.
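The effect of umask 007 can be checked in a scratch directory before touching a real shared environment. This is a small sketch under that assumption; the file and directory names are placeholders.

```shell
# Work in a scratch directory so no real environment is touched.
demo_dir="$(mktemp -d)"
cd "${demo_dir}"

# 007 removes all permissions for "other" and keeps full group access.
umask 007
touch shared_file
mkdir shared_dir

# New files come out as rw-rw---- and new directories as rwxrwx---.
ls -ld shared_file shared_dir
```

The umask only affects files created after it is set, so it needs to be in place before packages are installed into the shared location.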