
Conda

Overview

Conda is a package and environment manager widely used in data science and scientific computing. It lets you create isolated environments, each with its own package versions, so that projects with conflicting dependencies can coexist.

This guide shows how to use Conda efficiently on the HPC cluster.

Anaconda Licensing

Due to Anaconda license restrictions, we recommend using Conda through the conda-forge channel for unrestricted academic and commercial use.

Check Availability

# Check if Conda is available as a module
module avail conda

# Or check for Miniconda
module avail miniconda
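
The two checks above can be combined into one helper. This is a minimal sketch: the function name `detect_conda` is hypothetical, and because the output of `module avail` differs between module systems, the text match is only a heuristic.

```shell
# Report how Conda can be reached in the current shell session.
# Prints one of: "path", "module", "none".
# detect_conda is a hypothetical helper name, not part of Conda or the module system.
detect_conda() {
    if command -v conda >/dev/null 2>&1; then
        echo "path"      # conda is already on PATH or defined by "conda init"
    elif command -v module >/dev/null 2>&1 \
         && module avail conda 2>&1 | grep -qi conda; then
        echo "module"    # heuristic: some available module name contains "conda"
    else
        echo "none"
    fi
}

detect_conda
```

If the result is "none", Option 2 below (a personal installation) is the remaining route.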

Initial Setup

Option 1: Use Conda from Module

# Load Conda module
module load conda

# Initialize Conda (first time)
conda init bash

# Reload shell or log in again
source ~/.bashrc

Option 2: Install Personal Miniconda

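If you prefer to have your own Conda installation, you can install Miniconda in your home directory. Miniconda ships configured with Anaconda's defaults channel, so apply the conda-forge configuration from the next section right after installing: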

# Download Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Install non-interactively (-b) into the chosen location (-p)
bash Miniconda3-latest-Linux-x86_64.sh -b -p ~/miniconda3

# Initialize
~/miniconda3/bin/conda init bash
source ~/.bashrc
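
Before relying on the new installation, it can help to confirm that the executable is where you expect it. The sketch below assumes the `~/miniconda3` prefix used above; `check_conda_install` is a hypothetical helper name.

```shell
# Confirm that a Conda executable exists under a given prefix and print its version.
# The default prefix mirrors the -p argument used during installation.
check_conda_install() {
    prefix="${1:-$HOME/miniconda3}"
    if [ -x "$prefix/bin/conda" ]; then
        "$prefix/bin/conda" --version
    else
        echo "no conda executable found under $prefix" >&2
        return 1
    fi
}

# Usage:
#   check_conda_install                  # checks ~/miniconda3
#   check_conda_install /some/other/dir  # checks a custom prefix
```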

Configure Conda

Set conda-forge as default channel

# Add conda-forge
conda config --add channels conda-forge

# Set strict priority
conda config --set channel_priority strict

# Verify configuration
conda config --show channels

Configure package and environment directories

Where to create Conda environments

  • Small environments (< 10 GB): /home/$USER/miniconda3/envs/
  • Large environments: /scratch/projetos/<my_project>/conda_envs/

# Configure environment and package cache locations
conda config --add envs_dirs /scratch/projetos/<my_project>/conda_envs
conda config --add pkgs_dirs /scratch/projetos/<my_project>/.conda_pkgs

# Verify configuration
conda config --show envs_dirs
conda config --show pkgs_dirs
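
Creating the directories up front makes permission or quota problems show up before the first install. A minimal sketch, assuming the project directory is passed as an argument; `setup_conda_dirs` is a hypothetical helper name.

```shell
# Create the environment and package-cache directories for a project,
# then register them with Conda if the conda command is available.
setup_conda_dirs() {
    project_dir="$1"
    [ -n "$project_dir" ] || { echo "usage: setup_conda_dirs <project_dir>" >&2; return 2; }

    mkdir -p "$project_dir/conda_envs" "$project_dir/.conda_pkgs" || return 1

    if command -v conda >/dev/null 2>&1; then
        conda config --add envs_dirs "$project_dir/conda_envs"
        conda config --add pkgs_dirs "$project_dir/.conda_pkgs"
    else
        echo "conda not found; directories created but not registered" >&2
    fi
    return 0
}

# Usage (replace <my_project> with your project name):
#   setup_conda_dirs "/scratch/projetos/<my_project>"
```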

Manage Environments

Create environment

# Create environment with specific Python
conda create --name my_env python=3.10

# Create with initial packages
conda create --name data_science python=3.10 numpy pandas matplotlib

# Create from environment.yml file
conda env create -f environment.yml

Activate/Deactivate environment

# Activate
conda activate my_env

# Deactivate
conda deactivate

List environments

# List all environments
conda env list

# Or
conda info --envs

Remove environment

conda env remove --name my_env
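
In setup scripts it is often useful to create an environment only when it does not exist yet. The sketch below parses the first column of `conda env list`; `env_exists` is a hypothetical helper name, and environments created by path rather than by name would not be matched this way.

```shell
# Return success if a named environment appears in "conda env list".
# Comment lines in the listing start with "#", so they never equal a real name.
env_exists() {
    conda env list 2>/dev/null | awk '{print $1}' | grep -qxF -- "$1"
}

# Usage:
#   env_exists my_env || conda create --name my_env python=3.10
```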

Install Packages

Basic installation

# Activate environment first
conda activate my_env

# Install package
conda install numpy

# Install multiple packages
conda install numpy pandas scipy matplotlib

# Install from specific channel
conda install -c conda-forge scikit-learn

# Install specific version
conda install python=3.11

Update packages

# Update specific package
conda update numpy

# Update all packages
conda update --all

# Update Conda itself
conda update conda

List packages

# List packages in current environment
conda list

# Search for available package
conda search tensorflow

Use Conda in SLURM Jobs

Simple job

#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --output=/scratch/projetos/<my_project>/logs/job_%j.out
#SBATCH --error=/scratch/projetos/<my_project>/logs/job_%j.err
#SBATCH --time=02:00:00
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G

# Initialize Conda
source ~/miniconda3/etc/profile.d/conda.sh

# Or if using module
# module load conda

# Activate environment
conda activate my_env

# Run script
python analysis.py
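
Batch jobs run in non-interactive shells, so a missing `conda.sh` would otherwise only surface later as a confusing `conda activate` error. The sketch below fails early instead. `init_conda` and the `MY_CONDA_BASE` variable are hypothetical names, not something Conda defines.

```shell
# Source conda.sh from the first location that has it.
# MY_CONDA_BASE is a hypothetical override for installations outside ~/miniconda3.
init_conda() {
    for base in "${MY_CONDA_BASE:-}" "$HOME/miniconda3"; do
        if [ -n "$base" ] && [ -f "$base/etc/profile.d/conda.sh" ]; then
            . "$base/etc/profile.d/conda.sh"
            return 0
        fi
    done
    echo "conda.sh not found; check the installation path" >&2
    return 1
}

# Usage inside a job script:
#   init_conda || exit 1
#   conda activate my_env
```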

GPU job (TensorFlow/PyTorch)

#!/bin/bash
#SBATCH --job-name=deep_learning
#SBATCH --partition=gpu
#SBATCH --gpus=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=08:00:00
#SBATCH --output=/scratch/projetos/<my_project>/logs/gpu_%j.out

# Load necessary modules
module load cuda/11.8

# Initialize Conda
source ~/miniconda3/etc/profile.d/conda.sh

# Activate environment
conda activate pytorch_env

# Run training
python train_model.py
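
Before a long training run, it can be worth logging which GPU the job actually received. A minimal sketch, assuming `nvidia-smi` is installed on the GPU nodes; `report_gpu` is a hypothetical helper name.

```shell
# Print the name and total memory of each visible GPU, or a hint if none is reachable.
report_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
    else
        echo "nvidia-smi not found; is this job running on a GPU node?"
    fi
}

# Usage inside the job script, before the training command:
#   report_gpu
```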

See SLURM job examples for more options.

Export/Import Environments

Create environment.yml file

# Export current environment
conda env export > environment.yml

# Or only the packages you explicitly requested (more portable across systems)
conda env export --from-history > environment.yml

Example environment.yml:

name: my_env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy=1.24.3
  - pandas=2.0.3
  - matplotlib=3.7.2
  - scikit-learn=1.3.0
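
Because of the licensing note at the top of this guide, it may be worth checking which channels an environment file lists before sharing it. The sketch below is a simple line-based reader, not a YAML parser, and assumes the block-list layout shown above. `list_channels` is a hypothetical helper name.

```shell
# Print the entries of the top-level "channels:" list in an environment file.
list_channels() {
    awk '
        /^channels:/            { in_block = 1; next }   # start of the list
        in_block && /^[^ -]/    { in_block = 0 }         # next top-level key ends it
        in_block && /^[ \t]*-/  { sub(/^[ \t]*-[ \t]*/, ""); print }
    ' "$1"
}

# Usage:
#   list_channels environment.yml
#   list_channels environment.yml | grep -qx defaults && echo "file references defaults"
```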

Recreate environment

# Create environment from file
conda env create -f environment.yml

# Or update existing environment
conda env update -f environment.yml --prune

Best Practices

1. Always use conda-forge

# Configure once
conda config --add channels conda-forge
conda config --set channel_priority strict

2. Create environments per project

/scratch/projetos/<my_project>/conda_envs/
├── production/
├── development/
└── testing/

3. Keep environment.yml updated

# Whenever installing new packages
conda env export --from-history > environment.yml

4. Clean cache regularly

Conda environments can take up a lot of space:

# Clean unused packages
conda clean --all

# Check size
du -sh ~/miniconda3
du -sh /scratch/projetos/<my_project>/conda_envs

5. Use mamba for faster installations

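Mamba is a drop-in replacement for the conda command with a faster dependency solver. Recent Conda releases (23.10 and later) use the same libmamba solver by default, so the gain is largest on older installations: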

# Install mamba
conda install mamba -c conda-forge

# Use mamba instead of conda
mamba install numpy pandas
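
Scripts shared between users cannot assume that mamba is installed. One way to handle this, sketched under that assumption, is to pick the command at run time. `pick_installer` is a hypothetical helper name.

```shell
# Print "mamba" if it is available, otherwise fall back to "conda".
pick_installer() {
    if command -v mamba >/dev/null 2>&1; then
        echo "mamba"
    else
        echo "conda"
    fi
}

# Usage:
#   "$(pick_installer)" install numpy pandas
```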

Common Packages

Data Science

conda install numpy pandas scipy matplotlib seaborn

Machine Learning

conda install scikit-learn tensorflow pytorch torchvision

Bioinformatics

# biopython is maintained on conda-forge; samtools comes from bioconda
conda install biopython bioconda::samtools

Visualization

conda install plotly bokeh altair

Common Problems

Dependency conflicts

Problem: Conda cannot resolve dependencies.

Solution:

# Use mamba (faster)
mamba install <package>

# Or create new environment
conda create --name new_env <packages>

Environment too large

Problem: Environment takes up too much space.

Solution:

# Clean cache
conda clean --all

# Move to /scratch
mv ~/miniconda3/envs/my_env /scratch/projetos/<my_project>/conda_envs/
ln -s /scratch/projetos/<my_project>/conda_envs/my_env ~/miniconda3/envs/my_env
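
To decide which environment to move, it helps to see their sizes side by side. A minimal sketch, assuming environments live in `~/miniconda3/envs` unless another directory is given; `env_sizes` is a hypothetical helper name.

```shell
# List each environment directory with its size in KiB, largest first.
env_sizes() {
    envs_dir="${1:-$HOME/miniconda3/envs}"
    [ -d "$envs_dir" ] || { echo "not a directory: $envs_dir" >&2; return 1; }
    # du -sk prints sizes in KiB; sort -rn puts the largest entry at the top.
    du -sk "$envs_dir"/*/ 2>/dev/null | sort -rn
}

# Usage:
#   env_sizes
#   env_sizes "/scratch/projetos/<my_project>/conda_envs"
```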

Conda is slow

Problem: Package installation is very slow.

Solution:

# Install and use mamba
conda install mamba -c conda-forge
mamba install <package>

# Or disable automatic updates
conda config --set auto_update_conda false

Conflict with modules

Problem: Conflict between Conda and system modules.

Solution:

# Unload all modules before using Conda
module purge

# If Conda itself comes from a module, load it again after the purge
# module load conda

# Then activate Conda
conda activate my_env
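
A quick way to see whether a module is still shadowing the environment is to check where `python` resolves. The sketch below relies on the `CONDA_PREFIX` variable that Conda sets on activation; `python_in_active_env` is a hypothetical helper name.

```shell
# Succeed only if the python on PATH lives inside the active Conda environment.
python_in_active_env() {
    [ -n "${CONDA_PREFIX:-}" ] || { echo "no active Conda environment" >&2; return 1; }
    py="$(command -v python)" || { echo "python not found on PATH" >&2; return 1; }
    case "$py" in
        "$CONDA_PREFIX"/*)
            echo "ok: $py"
            ;;
        *)
            echo "warning: python resolves to $py, outside $CONDA_PREFIX" >&2
            return 1
            ;;
    esac
}

# Usage, after "conda activate my_env":
#   python_in_active_env
```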

Conda vs Python venv

Aspect           Conda                              Python venv
Packages         Python + non-Python (C, R, etc.)   Python only
Management       Environments + packages            Environments only
Size             Larger                             Smaller
Speed            Slower                             Faster
Recommended use  Data science, ML                   Pure Python development

See also the Python guide for comparison.

Additional Resources

Support

If you encounter problems with Conda: