Conda¶
Overview¶
Conda is a package and environment manager widely used in data science and scientific computing. It lets you create isolated environments, each with its own package versions, so that projects with conflicting dependencies can coexist.
This guide shows how to use Conda efficiently on the HPC cluster.
Anaconda Licensing
Anaconda's terms of service restrict use of its default package channel in some settings. We recommend installing packages from the community-maintained conda-forge channel, which is free for both academic and commercial use.
Check Availability¶
# Check if Conda is available as a module
module avail conda
# Or check for Miniconda
module avail miniconda
Initial Setup¶
Option 1: Use Conda from Module¶
# Load Conda module
module load conda
# Initialize Conda (first time)
conda init bash
# Reload shell or log in again
source ~/.bashrc
Option 2: Install Personal Miniconda¶
If you prefer to have your own Conda installation:
# Download Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Install non-interactively (-b) into the chosen prefix (-p)
bash Miniconda3-latest-Linux-x86_64.sh -b -p ~/miniconda3
# Initialize
~/miniconda3/bin/conda init bash
source ~/.bashrc
Configure Conda¶
Set conda-forge as default channel¶
# Add conda-forge
conda config --add channels conda-forge
# Set strict priority
conda config --set channel_priority strict
# Verify configuration
conda config --show channels
Configure package and environment directories¶
Where to create Conda environments
- Small environments (< 10 GB): /home/$USER/miniconda3/envs/
- Large environments: /scratch/projetos/<my_project>/conda_envs/
# Configure environment and package cache locations
conda config --add envs_dirs /scratch/projetos/<my_project>/conda_envs
conda config --add pkgs_dirs /scratch/projetos/<my_project>/.conda_pkgs
# Verify configuration
conda config --show envs_dirs
conda config --show pkgs_dirs
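The size guideline above can be checked before deciding where an environment should live. A minimal sketch, assuming the 10 GB limit means 10 x 1024 x 1024 KB as reported by du -sk; the function name suggest_env_location is hypothetical and not part of Conda:

```shell
# Hypothetical helper (not part of Conda): report whether an existing
# environment directory is under the size guideline.
# Usage: suggest_env_location <env_dir> [limit_kb]
suggest_env_location() {
    env_dir="$1"
    # Default limit: 10 GB in KB (assumption: 1 GB = 1024 * 1024 KB)
    limit_kb="${2:-10485760}"
    if [ ! -d "$env_dir" ]; then
        echo "not a directory: $env_dir" >&2
        return 1
    fi
    # du -sk prints "<size_in_kb><TAB><path>"; keep only the size
    size_kb=$(du -sk "$env_dir" | awk '{print $1}')
    if [ "$size_kb" -lt "$limit_kb" ]; then
        echo "home"
    else
        echo "scratch"
    fi
}
```

For example, suggest_env_location ~/miniconda3/envs/my_env prints home for an environment under the limit and scratch otherwise.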
Manage Environments¶
Create environment¶
# Create environment with specific Python
conda create --name my_env python=3.10
# Create with initial packages
conda create --name data_science python=3.10 numpy pandas matplotlib
# Create from environment.yml file
conda env create -f environment.yml
Activate/Deactivate environment¶
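The commands for this step do not appear in this section. Interactively, the standard pair is conda activate my_env to enter an environment and conda deactivate to leave it. The sketch below wraps that pair so that one command runs inside an environment and the shell returns to its previous state afterwards. with_env is a hypothetical helper name, and the sketch assumes conda activate already works in the current shell (see Initial Setup):

```shell
# Hypothetical helper (not part of Conda): run one command inside an
# environment, then deactivate again.
# Usage: with_env <env_name> <command> [args...]
with_env() {
    env_name="$1"
    shift
    conda activate "$env_name" || return 1
    "$@"
    rc=$?            # remember the command's exit status
    conda deactivate
    return "$rc"
}
```

For example, with_env my_env python --version runs Python from my_env and then leaves the environment.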
List environments¶
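No command is shown for this step. The standard one is conda env list, which prints a short header followed by one line per environment. A small sketch for scripts that only need the names follows. env_names is a hypothetical helper, and it assumes the usual output layout in which comment lines start with # and the first column holds the name (environments outside the configured envs_dirs are listed by path instead):

```shell
# Hypothetical helper (not part of Conda): print one environment name per line.
env_names() {
    # Skip comment lines and blank lines, then print the first column
    conda env list | awk '!/^#/ && NF > 0 {print $1}'
}
```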
Remove environment¶
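The removal command is not shown in this section. The standard form is conda env remove --name my_env. A cautious sketch for scripts follows; remove_env is a hypothetical wrapper that refuses to touch base and passes --yes to skip the confirmation prompt:

```shell
# Hypothetical wrapper (not part of Conda): remove a named environment,
# but never the base environment.
# Deactivate the environment first if it is currently active.
remove_env() {
    case "$1" in
        ""|base)
            echo "refusing to remove environment '$1'" >&2
            return 1
            ;;
    esac
    conda env remove --name "$1" --yes
}
```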
Install Packages¶
Basic installation¶
# Activate environment first
conda activate my_env
# Install package
conda install numpy
# Install multiple packages
conda install numpy pandas scipy matplotlib
# Install from specific channel
conda install -c conda-forge scikit-learn
# Install specific version
conda install python=3.11
Update packages¶
# Update specific package
conda update numpy
# Update all packages
conda update --all
# Update Conda itself
conda update conda
List packages¶
# List packages in current environment
conda list
# Search for available package
conda search tensorflow
Use Conda in SLURM Jobs¶
Simple job¶
#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --output=/scratch/projetos/<my_project>/logs/job_%j.out
#SBATCH --error=/scratch/projetos/<my_project>/logs/job_%j.err
#SBATCH --time=02:00:00
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
# Initialize Conda
source ~/miniconda3/etc/profile.d/conda.sh
# Or if using module
# module load conda
# Activate environment
conda activate my_env
# Run script
python analysis.py
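Batch jobs do not always inherit the interactive shell setup, so it can help to fail early when Conda cannot be initialized. A minimal sketch combining the two options from the job script above; init_conda is a hypothetical function name, and the path and module name are the ones this guide already uses:

```shell
# Hypothetical helper: make the conda command available inside a job script.
init_conda() {
    conda_sh="$HOME/miniconda3/etc/profile.d/conda.sh"
    if [ -f "$conda_sh" ]; then
        # Personal Miniconda installation
        . "$conda_sh"
    elif command -v module >/dev/null 2>&1; then
        # Cluster-provided Conda module
        module load conda
    else
        echo "Conda not found: no $conda_sh and no module command" >&2
        return 1
    fi
}
```

In a job script this would be called as init_conda || exit 1, followed by conda activate my_env.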
GPU job (TensorFlow/PyTorch)¶
#!/bin/bash
#SBATCH --job-name=deep_learning
#SBATCH --partition=gpu
#SBATCH --gpus=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=08:00:00
#SBATCH --output=/scratch/projetos/<my_project>/logs/gpu_%j.out
# Load necessary modules
module load cuda/11.8
# Initialize Conda
source ~/miniconda3/etc/profile.d/conda.sh
# Activate environment
conda activate pytorch_env
# Run training
python train_model.py
See SLURM job examples for more options.
Export/Import Environments¶
Create environment.yml file¶
# Export current environment
conda env export > environment.yml
# Or only the packages you explicitly requested (no transitive dependencies)
conda env export --from-history > environment.yml
Example environment.yml:
name: my_env
channels:
- conda-forge
dependencies:
- python=3.10
- numpy=1.24.3
- pandas=2.0.3
- matplotlib=3.7.2
- scikit-learn=1.3.0
Recreate environment¶
# Create environment from file
conda env create -f environment.yml
# Or update existing environment
conda env update -f environment.yml --prune
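The two commands above can be combined so that a script creates the environment the first time and updates it afterwards. A sketch under the assumption that conda env list prints the environment name in the first column; sync_env is a hypothetical helper, not a Conda command:

```shell
# Hypothetical helper: create the environment if it is missing,
# otherwise update it from the file.
# Usage: sync_env <env_name> [env_file]
sync_env() {
    env_name="$1"
    env_file="${2:-environment.yml}"
    # awk exits 0 only if some line has the name in its first column
    if conda env list | awk -v name="$env_name" '$1 == name {found = 1} END {exit !found}'; then
        conda env update --name "$env_name" -f "$env_file" --prune
    else
        conda env create --name "$env_name" -f "$env_file"
    fi
}
```

For example, sync_env my_env uses environment.yml in the current directory. The --name option overrides the name stored in the file.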
Best Practices¶
1. Always use conda-forge¶
2. Create environments per project¶
3. Keep environment.yml updated¶
4. Clean cache regularly¶
Conda environments can take up a lot of space:
# Clean unused packages
conda clean --all
# Check size
du -sh ~/miniconda3
du -sh /scratch/projetos/<my_project>/conda_envs
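To see which environments are worth cleaning or moving, the du check above can be extended to rank them by size. A small sketch; largest_envs is a hypothetical helper and assumes every subdirectory of the given directory is one environment:

```shell
# Hypothetical helper: list the largest environments in a directory.
# Usage: largest_envs <envs_dir> [count]
largest_envs() {
    envs_dir="$1"
    count="${2:-5}"
    # du -sk prints the size in KB per subdirectory; sort largest first
    du -sk "$envs_dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, largest_envs /scratch/projetos/<my_project>/conda_envs 3 prints the three largest environments with their sizes in KB.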
5. Use mamba for faster installations¶
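Mamba is a drop-in replacement for the conda command with a faster dependency solver. Conda 23.10 and later use the same libmamba solver by default, so the speedup matters most on older installations: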
# Install mamba
conda install mamba -c conda-forge
# Use mamba instead of conda
mamba install numpy pandas
Common Packages¶
Data Science¶
Machine Learning¶
Bioinformatics¶
Visualization¶
Common Problems¶
Dependency conflicts¶
Problem: Conda cannot resolve dependencies.
Solution:
# Use mamba (faster)
mamba install <package>
# Or create new environment
conda create --name new_env <packages>
Environment too large¶
Problem: Environment takes up too much space.
Solution:
# Clean cache
conda clean --all
# Move to /scratch and leave a symlink at the old path so existing references keep working
mv ~/miniconda3/envs/my_env /scratch/projetos/<my_project>/conda_envs/
ln -s /scratch/projetos/<my_project>/conda_envs/my_env ~/miniconda3/envs/my_env
Conda slow¶
Problem: Package installation is very slow.
Solution:
# Install and use mamba
conda install mamba -c conda-forge
mamba install <package>
# Or disable automatic updates
conda config --set auto_update_conda false
Conflict with modules¶
Problem: Conflict between Conda and system modules.
Solution:
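The solution text is missing from this section. One common approach, offered here as an assumption rather than this cluster's documented procedure, is to start from a clean state: unload all modules and clear PYTHONPATH before activating the environment, so that module-provided Python libraries cannot shadow the ones installed by Conda. clean_activate is a hypothetical helper name:

```shell
# Hypothetical helper: activate a Conda environment from a clean state.
# Usage: clean_activate <env_name>
clean_activate() {
    # Unload modules only where the module command exists
    if command -v module >/dev/null 2>&1; then
        module purge
    fi
    # Prevent paths outside the environment from leaking into Python
    unset PYTHONPATH
    conda activate "$1"
}
```

If Conda itself comes from module load conda, the module would need to be loaded again after the purge and before the activation.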
Conda vs Python venv¶
| Aspect | Conda | Python venv |
|---|---|---|
| Packages | Python + Non-Python (C, R, etc.) | Python only |
| Management | Environments + Packages | Environments only |
| Size | Larger | Smaller |
| Speed | Slower | Faster |
| Recommended use | Data science, ML | Pure Python development |
See also the Python guide for comparison.
Additional Resources¶
Support¶
If you encounter problems with Conda:
- Check the documentation: conda --help
- See our support page
- Contact us: hpc@fieb.org.br