Conda

From UFRC

Revision as of 22:55, 28 January 2023

For help on creating and managing personal environments, whether for command-line tool use or for Python package use in SLURM jobs or Jupyter kernels, see Managing Python environments and Jupyter kernels.

Description

conda website  

Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments. Keeping applications in separate conda environments allows otherwise incompatible dependencies, for example python2 and python3, to be installed side by side.

Note: For a faster conda see Mamba.

Environment Modules

Run module spider conda to find out what environment modules are available for this application.

System Variables

  • HPC_CONDA_DIR - installation directory
  • HPC_CONDA_BIN - executable directory

Configuration

The ~/.condarc configuration file

conda's behavior is controlled by a configuration file in your home directory called .condarc. The leading dot in the name means that the file is hidden from the 'ls' file listing command by default (use 'ls -a' to see it). If you have not run conda before, you won't have this file. Whether or not the file exists, the steps here will help you modify it to work best on HiPerGator. The first time you load the conda environment module on HiPerGator, the current best-practice .condarc is placed in your home directory.

conda package cache location

conda caches (keeps a copy of) all downloaded packages, by default in the ~/.conda/pkgs directory tree. If you install a lot of packages, you may end up filling your home quota. To change the default package cache path, add or change the pkgs_dirs setting in your ~/.condarc configuration file, e.g.:


pkgs_dirs:
  - /blue/mygroup/share/conda/pkgs

or

pkgs_dirs:
  - /blue/mygroup/$USER/conda/pkgs

Replace mygroup with your actual group name.
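
To judge whether the cache is worth moving, it can help to check how much space it currently takes. The following is a minimal sketch; cache_usage is a hypothetical helper name, not a conda or HiPerGator command.

```shell
# Print the disk usage of a package cache directory,
# or "absent" if the directory does not exist.
# cache_usage is a hypothetical helper name used for illustration.
cache_usage() {
    if [ -d "$1" ]; then
        du -sh "$1" | cut -f1
    else
        echo "absent"
    fi
}

# Example use (default cache location described above):
#   cache_usage "$HOME/.conda/pkgs"
```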

conda environment location

conda puts all packages installed in a particular environment into a single directory. By default, named conda environments are created in the ~/.conda/envs directory tree. They can quickly grow in size and, especially if you have many environments, fill the 40GB home directory quota. For example, the environment we will create in this training is 5.3GB in size. It is therefore important to use path-based conda environments (conda create -p PATH), which can live at any path you choose. This lets you, for example, keep a project-specific conda environment close to the project data in /blue/, where your group has terabyte(s) of space.

You can also change the default path for named environments (conda create -n NAME) if you prefer to keep all conda environments in the same directory tree. To do so, add or change the envs_dirs setting in the ~/.condarc configuration file, e.g.:

envs_dirs:
  - /blue/mygroup/share/conda/envs

or

envs_dirs:
  - /blue/mygroup/$USER/conda/envs

Replace mygroup with your actual group name.
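
As an alternative to editing the file by hand, both settings can be set from the command line with 'conda config --add', which prepends a value to a list setting in ~/.condarc. The following is a minimal sketch, assuming the conda module is already loaded; set_conda_dirs is a hypothetical helper name and mygroup is a placeholder.

```shell
# Create the envs and pkgs directories under a common base directory
# and register them in ~/.condarc.
# set_conda_dirs is a hypothetical helper name used for illustration.
set_conda_dirs() {
    base="$1"
    mkdir -p "$base/envs" "$base/pkgs"
    conda config --add envs_dirs "$base/envs"
    conda config --add pkgs_dirs "$base/pkgs"
}

# Example use (replace mygroup with your actual group name):
#   module load conda
#   set_conda_dirs "/blue/mygroup/$USER/conda"
#   conda config --show envs_dirs pkgs_dirs   # confirm the result
```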


One way to edit your ~/.condarc file is to type: nano ~/.condarc

If the file is empty, paste in the text below, editing the envs_dirs and pkgs_dirs entries to match your group and username. If the file already has contents, update those lines.

Your ~/.condarc should look something like this when you are done editing (again, replacing group and user in the paths with your group and username).
channels:
- conda-forge
- bioconda
- defaults
envs_dirs:
- /blue/group/user/conda/envs
pkgs_dirs:
- /blue/group/user/conda/pkgs
auto_activate_base: false
auto_update_conda: false
always_yes: false
show_channel_urls: false
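
If you prefer not to use an editor, the same file can be generated from the shell. The following is a minimal sketch that writes the settings listed above; write_condarc is a hypothetical helper name. Note that it replaces the whole file, so any other settings you had are only preserved in the .bak copy.

```shell
# Write a ~/.condarc with the settings shown above.
# Arguments: GROUP USER [TARGET_FILE]; TARGET_FILE defaults to ~/.condarc.
# write_condarc is a hypothetical helper name used for illustration.
write_condarc() {
    group="$1"
    user="$2"
    target="${3:-$HOME/.condarc}"

    # Keep a backup of any existing file before overwriting it.
    if [ -f "$target" ]; then
        cp "$target" "$target.bak"
    fi

    cat > "$target" <<EOF
channels:
  - conda-forge
  - bioconda
  - defaults
envs_dirs:
  - /blue/$group/$user/conda/envs
pkgs_dirs:
  - /blue/$group/$user/conda/pkgs
auto_activate_base: false
auto_update_conda: false
always_yes: false
show_channel_urls: false
EOF
}

# Example use (replace mygroup with your actual group name):
#   write_condarc mygroup "$USER"
```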

Additional Information

UF Research Computing Applications Team uses conda for many application installs behind the scenes. We are happy to install applications on request for you. However, if you would like to use conda to create multiple environments for your personal projects we encourage you to do so. Here are some recommendations for successful conda use on HiPerGator.

  • We suggest not installing your own copy of miniconda or anaconda. Instead, use one of our conda or mamba environment modules (module load conda) to get the 'conda engine' for free. A personal miniconda3 install will modify your ~/.bashrc and place environments in your ~/.conda/envs directory, which could fill up your home quota very fast.

See https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html for the original documentation on managing conda environments.

We recommend using the '-p' argument to create environments by 'path', so they don't fill up your home directory (check your quota with home_quota). The resulting environment should be located in the project(s) directory tree in /blue, for better tracking of installs and better filesystem performance compared to home.

Use the following command instead of 'conda create -n name_of_the_env':

conda create -yp /path/to/the/environment
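
The command above can be wrapped so that the environment path and the initial package list are passed in one call. The following is a minimal sketch, assuming the conda module is already loaded; create_project_env is a hypothetical helper name, and the path and package spec in the example are placeholders.

```shell
# Create a path-based conda environment at ENV_PATH with optional packages.
# Arguments: ENV_PATH [PACKAGE_SPEC...]
# create_project_env is a hypothetical helper name used for illustration.
create_project_env() {
    env_path="$1"
    shift
    conda create -y -p "$env_path" "$@"
}

# Example use (placeholder group, project, and package spec):
#   module load conda
#   create_project_env "/blue/mygroup/$USER/project1/conda_env" python=3.10
```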

There is a clear distinction between creating or modifying conda environments and using applications installed in those environments. To use applications installed in a conda environment, in most cases you only need to add its 'bin' directory to $PATH, whether in an interactive session or in a job script submitted to the scheduler. In rare cases, the LD_LIBRARY_PATH variable also needs to be set to the 'lib' sub-directory of the conda environment.

To modify a conda environment by installing or removing packages you need to 'activate' the environment first with a 'conda activate /path/to/the/conda/env' command.

Because of that distinction, we strongly recommend against installing miniconda manually and allowing it to insert its activation code into your shell initialization file, ~/.bashrc. If you have already let it do so, please remove the offending code from ~/.bashrc. It is the text within and including the following lines:

# >>> conda initialize >>>
...
# <<< conda initialize <<<
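
The block can be removed by hand in an editor, or with sed. The following is a minimal sketch, assuming the two marker lines appear exactly as shown above at the start of a line; strip_conda_init is a hypothetical helper name. It keeps the original file as a .bak copy so the change can be reverted.

```shell
# Delete everything from the ">>> conda initialize >>>" marker through the
# "<<< conda initialize <<<" marker, inclusive, from a shell startup file.
# Argument: [FILE]; defaults to ~/.bashrc. The original is saved as FILE.bak.
# strip_conda_init is a hypothetical helper name used for illustration.
strip_conda_init() {
    rc_file="${1:-$HOME/.bashrc}"
    sed -i.bak '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$rc_file"
}

# Example use:
#   strip_conda_init            # edits ~/.bashrc, keeps ~/.bashrc.bak
```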

Instead, load the conda environment module with 'module load conda' when you need to create or modify conda environments, and only set PATH when using the environment, as described above. For example:

$ module load conda
$ conda activate /path/to/the/environment

Once you are done installing packages inside the environment, you can deactivate it with

$ conda deactivate

We do not recommend activating conda environments when _using_ them, i.e. when running programs installed in them. Prepend the environment's 'bin' directory to your $PATH instead.

For example, if you have a project-specific conda environment at '/home/myuser/envs/project1/', add the following to your job script before executing any commands:

export PATH=/home/myuser/envs/project1/bin:$PATH
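
The same idea can be written as a small reusable function for job scripts. The following is a minimal sketch; use_env and use_env_libs are hypothetical helper names. The second function covers the rare LD_LIBRARY_PATH case mentioned earlier and is only needed if a program fails to find its shared libraries.

```shell
# Prepend a conda environment's bin directory to PATH.
# use_env is a hypothetical helper name used for illustration.
use_env() {
    env_path="${1%/}"    # drop a trailing slash, if any
    export PATH="$env_path/bin:$PATH"
}

# Rare case: also expose the environment's lib directory.
# use_env_libs is a hypothetical helper name used for illustration.
use_env_libs() {
    env_path="${1%/}"
    export LD_LIBRARY_PATH="$env_path/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example use in a job script:
#   use_env /home/myuser/envs/project1/
#   command -v python    # should now point into the environment's bin directory
```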