Data Transfer¶
Overview¶
Transferring data between the HPC cluster and your local computer is an essential operation in high-performance computing workflows. This guide presents the most common methods for securely and efficiently transferring files.
Prerequisites
Before transferring data, make sure you:
- Have an SSH connection configured
- Know the cluster address (e.g., ogun-login.senaicimatec.com.br)
- Know which directory to use on the cluster (see File Management)
Transfer Methods¶
SCP (Secure Copy)¶
scp is the simplest tool for transferring files via SSH. It works similarly to the cp command but allows copying between different machines.
Download file from cluster to your computer¶
Example:
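A minimal sketch of a single-file download. The user name, host, and project come from the SFTP example session on this page; the file name results.csv is hypothetical. The snippet only prints the command, so you can review it before running it yourself.

```shell
# Illustrative values (assumptions): replace with your own account and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
REMOTE_FILE="/scratch/projetos/climate_analysis/results.csv"   # hypothetical file
LOCAL_DIR="."                                                  # current local directory

# General form: scp <username>@<cluster>:<remote_file> <local_directory>
cmd="scp ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_FILE} ${LOCAL_DIR}"

# Print the command for review; run it in your terminal once the values are right
echo "$cmd"
```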
Download entire directory (recursive)¶
Example:
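A sketch of a recursive download, assuming the same illustrative account and the results_2024 directory from the SFTP example session. The -r option tells scp to copy the directory and everything inside it. The command is printed rather than executed.

```shell
# Illustrative values (assumptions): replace with your own account and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
REMOTE_DIR="/scratch/projetos/climate_analysis/results_2024"
LOCAL_DIR="."

# -r copies the directory recursively, including subdirectories
cmd="scp -r ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR} ${LOCAL_DIR}"

# Print the command for review before running it
echo "$cmd"
```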
Upload file from your computer to the cluster¶
Example:
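For an upload, the local path comes first and the cluster path second. The sketch below assumes a hypothetical local file named input.dat and the illustrative account used elsewhere on this page. It prints the command instead of running it.

```shell
# Illustrative values (assumptions): replace with your own account and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
LOCAL_FILE="./input.dat"                                # hypothetical local file
REMOTE_DIR="/scratch/projetos/climate_analysis/"

# General form: scp <local_file> <username>@<cluster>:<remote_directory>
cmd="scp ${LOCAL_FILE} ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR}"

# Print the command for review before running it
echo "$cmd"
```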
Upload directory to the cluster¶
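Uploading a directory works like uploading a file, with the -r option added. A sketch, assuming a hypothetical local directory named my_folder and the illustrative account used elsewhere on this page:

```shell
# Illustrative values (assumptions): replace with your own account and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
LOCAL_DIR="./my_folder"                                 # hypothetical local directory
REMOTE_DIR="/scratch/projetos/climate_analysis/"

# -r copies my_folder and its contents into the remote directory
cmd="scp -r ${LOCAL_DIR} ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR}"

# Print the command for review before running it
echo "$cmd"
```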
Custom SSH port
If the cluster uses a different SSH port than the default (22), use the -P option:
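A sketch of the port option. Port 2222 is purely hypothetical; use the port your cluster administrators give you. Note that scp takes an uppercase -P, whereas ssh uses a lowercase -p for the port.

```shell
# Illustrative values (assumptions): replace with your own account, port, and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
SSH_PORT="2222"                                         # hypothetical non-default port
LOCAL_FILE="./input.dat"                                # hypothetical local file
REMOTE_DIR="/scratch/projetos/climate_analysis/"

# -P (uppercase) selects the SSH port and must come before the file arguments
cmd="scp -P ${SSH_PORT} ${LOCAL_FILE} ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR}"

# Print the command for review before running it
echo "$cmd"
```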
Rsync (Recommended for large transfers)¶
rsync is more efficient than scp for large transfers or when you need to synchronize directories. It transfers only the differences between files, saving time and bandwidth.
Download data from cluster¶
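A minimal sketch of an rsync download, using the illustrative account and the results_2024 directory from the SFTP example session. The trailing slash on the source means "copy the contents of this directory"; without it, rsync creates the directory itself inside the destination. The command is printed rather than executed.

```shell
# Illustrative values (assumptions): replace with your own account and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
REMOTE_DIR="/scratch/projetos/climate_analysis/results_2024/"   # trailing slash: copy contents
LOCAL_DIR="./results_2024/"

# General form: rsync -av --progress <username>@<cluster>:<remote_dir>/ <local_dir>/
cmd="rsync -av --progress ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR} ${LOCAL_DIR}"

# Print the command for review before running it
echo "$cmd"
```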
Options explained:
- -a: archive mode (preserves permissions, timestamps, etc.)
- -v: verbose (shows details)
- --progress: shows transfer progress
Upload data to the cluster¶
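An upload simply swaps source and destination. The sketch below assumes a hypothetical local directory named project and the illustrative account used elsewhere on this page. It prints the command instead of running it.

```shell
# Illustrative values (assumptions): replace with your own account and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
LOCAL_DIR="./project/"                                  # hypothetical local directory
REMOTE_DIR="/scratch/projetos/climate_analysis/"

# Local source first, cluster destination second
cmd="rsync -av --progress ${LOCAL_DIR} ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR}"

# Print the command for review before running it
echo "$cmd"
```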
Synchronization with deletions¶
If you've deleted files locally and want them deleted on the destination as well:
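One cautious way to do this, sketched under the same illustrative account and a hypothetical local directory named project, is to preview the run with --dry-run before performing it. With --dry-run, rsync reports what it would copy and delete without changing anything. Both commands are printed rather than executed.

```shell
# Illustrative values (assumptions): replace with your own account and paths
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"
LOCAL_DIR="./project/"                                  # hypothetical local directory
REMOTE_DIR="/scratch/projetos/climate_analysis/"
TARGET="${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR}"

# Step 1: preview only; nothing is copied or deleted
preview_cmd="rsync -av --delete --dry-run ${LOCAL_DIR} ${TARGET}"

# Step 2: the same command without --dry-run performs the synchronization
sync_cmd="rsync -av --delete ${LOCAL_DIR} ${TARGET}"

# Print both commands for review
echo "$preview_cmd"
echo "$sync_cmd"
```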
Caution with --delete
The --delete option removes files on the destination that don't exist in the source. Use with care!
Transfer with pattern exclusions¶
rsync -av --progress --exclude='*.tmp' --exclude='__pycache__' ~/project/ <username>@<cluster>:/scratch/projetos/<my_project>/
SFTP (Interactive Transfer)¶
sftp provides an interactive interface similar to FTP, but secure via SSH.
Connect to cluster via SFTP¶
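Connecting takes the same user and host form as ssh. A sketch with the illustrative account from the example session on this page; the command is printed rather than executed.

```shell
# Illustrative values (assumptions): replace with your own account
CLUSTER_USER="john"
CLUSTER_HOST="ogun-login.senaicimatec.com.br"

# General form: sftp <username>@<cluster>
cmd="sftp ${CLUSTER_USER}@${CLUSTER_HOST}"

# Print the command for review; running it opens the interactive sftp> prompt
echo "$cmd"
```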
Basic SFTP commands¶
After connecting, you can use the following commands:
# List files on the cluster
ls
# List files on your local computer
lls
# Change directory on the cluster
cd /scratch/projetos/<my_project>/
# Change local directory
lcd ~/Downloads/
# Download file
get file.txt
# Download directory (recursive)
get -r folder/
# Upload file
put my_file.txt
# Upload directory
put -r my_folder/
# Exit
exit
Example session:
$ sftp john@ogun-login.senaicimatec.com.br
sftp> cd /scratch/projetos/climate_analysis/
sftp> ls
results_2024/ data/ scripts/
sftp> get -r results_2024/
Fetching /scratch/projetos/climate_analysis/results_2024/ to results_2024
sftp> exit
Graphical Clients¶
For users who prefer graphical interfaces, several options are available:
Windows¶
- MobaXterm: Integrated interface with file browser (recommended)
- WinSCP: Dedicated SFTP/SCP client
- FileZilla: Supports SFTP
macOS¶
Linux¶
- FileZilla: Available on most distributions
- Native file managers (Nautilus, Dolphin) with SFTP support via sftp:// addresses
Best Practices¶
Data Organization¶
- Use appropriate directories:
  - /scratch/projetos/<project>/ for project job input/output data
  - /home/$USER/ only for small personal scripts and configurations
  - See File Management for details on organization
- Organize by project:
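One possible per-project layout, sketched here as an assumption rather than a cluster requirement. The subdirectory names data, scripts, and results mirror those in the SFTP example session on this page. A temporary directory stands in for /scratch/projetos/ so the sketch can be tried anywhere.

```shell
# Stand-in for /scratch/projetos/ (assumption, so the sketch runs on any machine)
PROJECTS_ROOT="$(mktemp -d)"
PROJECT="climate_analysis"                  # illustrative project name

# One directory per project, with inputs, code, and outputs kept apart
mkdir -p "$PROJECTS_ROOT/$PROJECT/data" \
         "$PROJECTS_ROOT/$PROJECT/scripts" \
         "$PROJECTS_ROOT/$PROJECT/results"

# Show the resulting layout
ls "$PROJECTS_ROOT/$PROJECT"
```

Keeping inputs, scripts, and outputs in separate subdirectories makes it easy to transfer only the part you need, for example just the results directory.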
Transfer Optimization¶
- Compress large files before transferring.
- Use rsync to resume interrupted transfers. The --partial option keeps partially transferred files so that a later run can resume them.
- Transfer during off-peak hours when possible.
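For the compression tip above, a minimal sketch using tar with gzip. It builds a small throwaway directory (the file name and contents are hypothetical), packs it into a single archive, and lists the archive to confirm what would be transferred.

```shell
# Work in a throwaway directory so the sketch does not touch real data
workdir="$(mktemp -d)"
mkdir -p "$workdir/results"
printf 'sample output\n' > "$workdir/results/run1.txt"   # hypothetical data

# -c create, -z gzip-compress, -f archive name; -C switches directory first
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# -t lists the archive contents without extracting them
tar -tzf "$workdir/results.tar.gz"
```

The resulting results.tar.gz can then be sent with scp or rsync and unpacked on the other side with tar -xzf results.tar.gz. Compression helps most with text-like data; files that are already compressed gain little.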
Integrity Verification¶
After important transfers, verify data integrity:
# Generate checksum on the cluster
md5sum file.dat > file.md5
# Transfer both
scp <username>@<cluster>:/scratch/projetos/<my_project>/file.* ~/
# Verify locally, from the directory that received both files
cd ~
md5sum -c file.md5
Limitations and Quotas¶
Respect storage quotas
Each directory has storage limits. Contact hpc@fieb.org.br for information about quotas.
- Bandwidth: Very large transfers can affect other users. Be considerate.
- Connection timeout: SSH connections can expire. Use tools like screen or tmux for long sessions.
Checking Available Space¶
Before transferring data, check available space:
# Check disk usage in a project directory
du -sh /scratch/projetos/<my_project>/
# Check quota (if available)
quota -s
# Available space on filesystem
df -h /scratch
Common Problems¶
Permission denied¶
If you receive "Permission denied":
- Verify that your SSH key is correctly configured
- Confirm that you have write permission in the destination directory
- Check that you haven't exceeded your storage quota
Slow transfer¶
If transfers are slow:
- Try compressing data before transferring
- Use rsync with compression (-z)
- Check your internet connection
- Avoid peak hours if possible
Interrupted connection¶
For long transfers that may be interrupted:
# Use rsync with --partial to resume
rsync -avz --progress --partial <source> <destination>
# Or use screen/tmux on the server
screen
rsync -avz --progress <source> <destination>
# Press Ctrl+A, then D to detach
# Reconnect later with: screen -r
Support¶
If you encounter problems transferring data, see our support page or contact hpc@fieb.org.br.