When transferring scientific calculation results between servers, we need a solution that:
- Securely copies files via SSH
- Ensures complete data transfer
- Automatically removes source files after successful transfer
- Handles naming conflicts intelligently
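Here's the fundamental rsync command for your scenario. rsync cannot copy between two remote hosts in a single invocation, so run it on serverA and push to serverB:
rsync -avz --remove-source-files -e ssh /process/calc1/ user@serverB:/process/calc1/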
Rsync provides several approaches for conflict resolution:
Option 1: Skip existing files
rsync -avz --ignore-existing --remove-source-files -e ssh /process/calc1/ user@serverB:/process/calc1/
Option 2: Auto-renaming with suffix
For automatic renaming (calc1, calc1R2, calc1R3, etc.), run this script on serverA:
#!/bin/bash
# Push /process/calc1 to serverB under the first unused directory name.
SOURCE="/process/calc1"
REMOTE="user@serverB"
BASE="calc1"
COUNT=1
TARGET="$BASE"
# Probe serverB until a name is found that is not taken yet.
while ssh "$REMOTE" test -d "/process/$TARGET" </dev/null; do
    COUNT=$((COUNT+1))
    TARGET="${BASE}R${COUNT}"
done
rsync -avz --remove-source-files -e ssh "$SOURCE/" "$REMOTE:/process/$TARGET/"
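The renaming loop can also be factored into a small function so the naming logic can be tried locally before pointing it at a real server. This is only a sketch: `next_free_name`, `local_dir_exists`, and `remote_dir_exists` are hypothetical helper names, and the remote checker assumes directory names without spaces or shell metacharacters.

```shell
#!/bin/bash
# Sketch: pick the first unused name in the series calc1, calc1R2, calc1R3, ...
# Usage: next_free_name BASE CHECK_CMD [CHECK_ARGS...]
#   CHECK_CMD is called with the candidate name appended as its last argument
#   and must succeed when that name is already taken.
next_free_name() {
    local base=$1
    shift
    local count=1 target=$base
    while "$@" "$target" </dev/null; do
        count=$((count + 1))
        target="${base}R${count}"
    done
    printf '%s\n' "$target"
}

# Hypothetical checkers (names are illustrative, not part of rsync or ssh).
local_dir_exists()  { test -d "$1/$2"; }             # args: PARENT NAME
remote_dir_exists() { ssh "$1" test -d "$2/$3"; }    # args: HOST PARENT NAME

# Example (on serverA):
#   target=$(next_free_name calc1 remote_dir_exists user@serverB /process)
#   rsync -avz --remove-source-files /process/calc1/ "user@serverB:/process/$target/"
```

Passing the checker as an argument keeps the counting logic identical for local dry runs and for the SSH case.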
For more complex scenarios, consider these rsync options:
- --backup: Keep each destination file that would be overwritten, renamed with a ~ suffix
- --backup-dir=DIR: Used with --backup, move those preserved files into DIR on the receiving side
- --suffix=SUFFIX: Replace the default ~ backup suffix
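rsync already verifies every transferred file with a whole-file checksum. Adding --checksum also makes it compare files that exist on both sides by content rather than by size and modification time:
rsync -avz --checksum --remove-source-files -e ssh /process/calc1/ user@serverB:/process/calc1/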
Add error checking to your script:
if ! rsync -avz --remove-source-files -e ssh /process/calc1/ user@serverB:/process/calc1/; then
    echo "Transfer failed!" >&2
    exit 1
fi
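A plain "Transfer failed!" hides why rsync stopped. The sketch below maps a few of the exit values documented in the rsync man page to readable messages. The helper names `describe_rsync_exit` and `is_retryable` are made up for illustration, and the choice of which codes to retry is an assumed policy, not something rsync prescribes.

```shell
#!/bin/bash
# Sketch: turn an rsync exit status into a log-friendly message.
describe_rsync_exit() {
    case $1 in
        0)   echo "success" ;;
        12)  echo "error in rsync protocol data stream" ;;
        23)  echo "partial transfer due to error" ;;
        24)  echo "partial transfer: source files vanished" ;;
        30)  echo "timeout in data send/receive" ;;
        255) echo "remote shell failed (usually an SSH connection problem)" ;;
        *)   echo "rsync failed with exit code $1" ;;
    esac
}

# Assumed policy: treat network-related failures as worth retrying.
is_retryable() {
    case $1 in
        12|23|30|255) return 0 ;;
        *)            return 1 ;;
    esac
}

# Example (on serverA):
#   rsync -avz --remove-source-files /process/calc1/ user@serverB:/process/calc1/
#   status=$?
#   [ "$status" -eq 0 ] || echo "Transfer failed: $(describe_rsync_exit "$status")" >&2
```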
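For recurring transfers, create a monitoring script on serverA. The create event fires as soon as a directory appears, so this works best when finished result directories are moved into place:
#!/bin/bash
SOURCE_DIR="/process"
DEST="user@serverB:/process/"
# -m keeps inotifywait running; each new entry is printed as a full path.
inotifywait -m -e create -e moved_to --format "%w%f" "$SOURCE_DIR" | while IFS= read -r NEWDIR
do
    # Only directories are handed to the transfer script.
    if [ -d "$NEWDIR" ]; then
        BASENAME=$(basename "$NEWDIR")
        # </dev/null stops ssh/rsync from swallowing the event stream on stdin.
        ./transfer_script.sh "$NEWDIR" "$DEST$BASENAME" </dev/null
    fi
done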
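If a calculation writes directly into its result directory, the watcher may fire long before the results are complete. One possible guard, sketched here under the assumption that a finished job stops touching its directory, is to wait until a fingerprint of the tree stops changing. `dir_fingerprint` and `wait_until_stable` are hypothetical helpers, and a job that pauses for longer than the polling interval would still be mistaken for finished, so a sentinel file written by the job itself is the safer signal where that is possible.

```shell
#!/bin/bash
# Sketch: fingerprint of a directory tree. It changes when files are added,
# removed, resized, or modified (within the resolution of ls -l timestamps).
dir_fingerprint() {
    ls -lR "$1" 2>/dev/null | cksum
}

# Block until two consecutive fingerprints, taken INTERVAL seconds apart, match.
# Usage: wait_until_stable DIR [INTERVAL_SECONDS]
wait_until_stable() {
    local dir=$1 interval=${2:-30} prev cur
    prev=$(dir_fingerprint "$dir")
    while sleep "$interval"; do
        cur=$(dir_fingerprint "$dir")
        if [ "$cur" = "$prev" ]; then
            return 0
        fi
        prev=$cur
    done
}

# Example inside the watcher loop, before calling the transfer script:
#   wait_until_stable "$NEWDIR" 60
```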
When dealing with scientific computing workflows, we often need to transfer calculation results between servers while maintaining data integrity. The key requirements include:
- Secure transfer via SSH
- Verification of complete file copies
- Source file cleanup after transfer
- Automated conflict resolution for duplicate filenames
Here's the fundamental rsync command for your scenario, run from serverA because rsync needs one side of the copy to be local:
rsync -avz --remove-source-files -e ssh /process/calc1/ user@serverB:/process/calc1/
The flags used:
- -a: Archive mode (recursive; preserves permissions, ownership, timestamps, and symlinks)
- -v: Verbose output
- -z: Compression during transfer
- --remove-source-files: Delete source files after successful transfer (empty source directories are left behind)
- -e ssh: Specify SSH as the remote shell (already the default in current rsync versions)
rsync doesn't have built-in versioning for directory conflicts, but we can implement solutions:
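Option 1: Timestamp-based Renaming
With --backup, each destination file that would be overwritten is kept under a timestamped name. This works per file, not per directory:
rsync -avz --remove-source-files --backup --suffix="$(date +'_%Y%m%d_%H%M%S')" -e ssh /process/calc1/ user@serverB:/process/calc1/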
Option 2: Sequential Numbering
For more controlled versioning, run a script on serverA:
#!/bin/bash
# Push calc1 to serverB, falling back to calc1R1, calc1R2, ... if the name is taken.
source_dir="/process/calc1"
dest_host="user@serverB"
dest_base="/process"
target="calc1"
counter=1
# The existence check has to run on serverB, not locally.
while ssh "$dest_host" test -d "${dest_base}/${target}" </dev/null; do
    target="calc1R${counter}"
    ((counter++))
done
rsync -avz --remove-source-files -e ssh "${source_dir}/" "${dest_host}:${dest_base}/${target}/"
To ensure complete transfers and proper error handling, run the transfer on serverA and then remove the empty directories that --remove-source-files leaves behind:
rsync -avz --remove-source-files --checksum --progress --stats -e ssh /process/calc1/ user@serverB:/process/calc1/ && \
    find /process/calc1 -depth -type d -empty -delete
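For results that are expensive to recompute, a more conservative variant is to record checksums before the transfer, copy without --remove-source-files, verify on the destination, and only then delete the source. The sketch below assumes GNU coreutils `sha256sum` on both servers. `make_manifest`, `verify_manifest`, and the file name `MANIFEST.sha256` are illustrative choices, not rsync features.

```shell
#!/bin/bash
# Sketch: write DIR/MANIFEST.sha256 listing a checksum for every file under DIR.
# Paths are stored relative to DIR so the manifest stays valid after copying.
make_manifest() {
    ( cd "$1" && find . -type f ! -name MANIFEST.sha256 \
          -exec sha256sum {} + > MANIFEST.sha256 )
}

# Check DIR against its manifest; succeeds only if every listed file matches.
verify_manifest() {
    ( cd "$1" && sha256sum --quiet -c MANIFEST.sha256 )
}

# Example (on serverA), deleting the source only after remote verification:
#   make_manifest /process/calc1 &&
#   rsync -avz /process/calc1/ user@serverB:/process/calc1/ &&
#   ssh user@serverB 'cd /process/calc1 && sha256sum --quiet -c MANIFEST.sha256' &&
#   rm -rf /process/calc1
```

Because the manifest travels inside the directory, the destination can be re-verified at any later point as well.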
For regular transfers, set up a cron job on serverA. Here flock -n skips a run while the previous one is still active, and all output is appended to a log file:
0 * * * * /usr/bin/flock -n /tmp/rsync_process.lock /path/to/your/transfer_script.sh >> /var/log/process_transfer.log 2>&1
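Since the cron entry appends everything to one file, timestamped lines make it easier to tell runs apart. A minimal sketch of helpers that a transfer script could use is shown here. `log` and `run_logged` are hypothetical names, and they write to stdout on the assumption that cron redirects it to the log file as above.

```shell
#!/bin/bash
# Sketch: print one timestamped log line.  Usage: log LEVEL MESSAGE
log() {
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2"
}

# Run a command, logging its start and outcome, and return its exit status.
run_logged() {
    local status
    log INFO "start: $*"
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        log INFO "done: $*"
    else
        log ERROR "exit $status: $*"
    fi
    return "$status"
}

# Example inside the transfer script:
#   run_logged rsync -az --remove-source-files /process/calc1/ user@serverB:/process/calc1/
```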
For more complex scenarios, consider:
- Using unison for bidirectional sync with conflict resolution
- Implementing a custom solution with Python's paramiko and shutil libraries
- Setting up a message queue system for distributed file operations