Optimizing FTP Mirroring: Parallel Download Techniques for Speed Improvement


Using wget -m for FTP mirroring is reliable but painfully slow for large sites. The sequential nature of wget means it downloads files one after another, leaving bandwidth underutilized. Modern solutions should leverage parallel connections and connection reuse.
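
For reference, the sequential baseline this article improves on is a single command (host and path are placeholders):

# Baseline: wget mirrors recursively but opens one connection and fetches one file at a time
wget -m ftp://example.com/pub/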

For efficient FTP mirroring, consider these tools that support parallel downloads:

# lftp example (best for parallel FTP)
lftp -e "mirror --parallel=8 --use-pget-n=5 /remote/path /local/path" ftp://example.com

# aria2 example (FTP and HTTP; it cannot expand remote wildcards, so give explicit file URLs)
aria2c --ftp-pasv=true --max-connection-per-server=8 --split=8 ftp://example.com/path/file.iso
Tool    Parallel Connections    Resume Support    Speed
----    --------------------    --------------    ---------
wget    1                       Yes               Slow
lftp    Configurable            Yes               Fast
aria2   Configurable            Yes               Very Fast

For complete mirroring with parallel downloads and bandwidth optimization:

# Full-featured mirror script
lftp -u username,password ftp://example.com << EOF
set net:connection-limit 8
set mirror:use-pget-n 5
mirror -c --parallel=8 /remote_dir /local_dir
quit
EOF
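
Before committing to a long transfer, mirror's --dry-run flag previews what would be transferred without writing anything; a quick sketch using the same placeholder credentials:

# Preview the transfer plan; no files are downloaded
lftp -u username,password -e "mirror --dry-run /remote_dir /local_dir; quit" ftp://example.com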

For secure FTP connections with parallel downloads:

# Using lftp with FTPS
lftp -e "set ftp:ssl-force true; set ftp:ssl-protect-data true; \
mirror --parallel=8 --user=username --password=password /path" ftps://example.com
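
If the server presents a self-signed certificate, lftp refuses the connection by default. One common workaround (use with care, it trades security for convenience) is to disable verification:

# Only for servers you trust: skip TLS certificate verification
lftp -u username,password -e "set ssl:verify-certificate no; set ftp:ssl-force true; \
mirror --parallel=8 /remote/path /local/path; quit" ftps://example.com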

Create a cron job for scheduled mirroring with parallel downloads:

# Weekly mirror with email notification (a crontab entry must be a single line)
0 3 * * 1 lftp -e "mirror --parallel=8 --only-newer /remote /local; quit" ftp://user:pass@example.com 2>&1 | mail -s "Mirror Complete" admin@example.com
  • Adjust parallel connections based on server limits (start with 4-8)
  • Use --parallel=4 --use-pget-n=3 in lftp for large files
  • Set net:connection-limit to prevent overloading the server
  • Consider --only-newer or --only-missing for incremental updates (see the sketch below)
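
Putting those tips together, a conservative incremental sync might look like this (host and paths are placeholders):

# Incremental sync: modest parallelism, newer files only, capped connections
lftp -c "set net:connection-limit 4; mirror --parallel=4 --use-pget-n=3 --only-newer /remote_dir /local_dir" ftp://example.com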

When using wget -m ftp://example.com, the process downloads files sequentially - a major bottleneck for large repositories. Each file transfer must complete before the next begins, underutilizing available bandwidth.

For modern FTP mirroring, consider these parallel download tools:

1. lftp - The FTP Power Tool

lftp supports parallel transfers with segmented downloads:

lftp -e "mirror --parallel=10 --use-pget-n=5 /remote/dir /local/dir" ftp://user:pass@host

Key options:
--parallel: Number of concurrent connections (try 5-10)
--use-pget-n: Segments per file (3-5 works well; see the standalone pget example below)
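
For a single large file outside of mirror, pget applies the same segmentation on its own; the filename here is hypothetical:

# Fetch one file in 5 parallel segments, resuming a partial transfer if present (-c)
lftp -e "pget -n 5 -c /remote/dir/big.iso -o /local/dir/big.iso; quit" ftp://user:pass@host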

2. aria2 - The Download Accelerator

aria2 handles FTP with multi-connection support:

# aria2 cannot expand a remote wildcard; list the file URLs explicitly
aria2c --ftp-user=USER --ftp-passwd=PASS \
--split=10 --max-connection-per-server=16 \
--dir=/local/path --force-sequential=false \
ftp://host/path/file1.iso ftp://host/path/file2.iso

Pro tip: Combine with --input-file to process file lists.
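
A minimal sketch of that workflow, assuming you can enumerate the remote files yourself (the URLs are placeholders):

# files.txt holds one FTP URL per line
printf '%s\n' \
  ftp://host/path/disk1.img \
  ftp://host/path/disk2.img > files.txt

# -j / --max-concurrent-downloads controls how many downloads run at once
aria2c --ftp-user=USER --ftp-passwd=PASS --input-file=files.txt \
  --max-concurrent-downloads=8 --split=8 --dir=/local/path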

3. GNU Parallel + wget

For existing wget scripts, GNU Parallel can bolt on parallelism: mirror once to learn the tree, then update many files at a time:

# First pass: wget builds the local tree sequentially and keeps FTP listings
wget -m --ftp-user=USER --ftp-password=PASS --no-remove-listing ftp://host/path/
# Later syncs: rebuild remote URLs from the local tree, update 16 at a time
# (-N re-fetches only changed files; note wget rejects -nc combined with -c)
cd host/path && find . -type f ! -name '.listing' | sed 's|^\./||' | \
parallel -j 16 wget -N --ftp-user=USER --ftp-password=PASS "ftp://host/path/{}"

Connection Tuning: Make sure TCP window scaling is enabled (sysctl -w net.ipv4.tcp_window_scaling=1; it already is on modern Linux kernels)
Resume Support: Always use -c/--continue so interrupted transfers pick up where they left off
Bandwidth Control: Cap the rate with each tool's own option: wget --limit-rate, aria2 --max-download-limit, lftp's net:limit-rate setting (see below)
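
The rate cap is spelled differently in each tool; roughly equivalent 2 MB/s limits, with placeholder URLs:

# wget: built-in flag
wget --limit-rate=2m -m ftp://example.com/pub/
# aria2: per-download limit
aria2c --max-download-limit=2M ftp://example.com/pub/file.iso
# lftp: a setting rather than a flag
lftp -c "set net:limit-rate 2M; mirror /remote /local" ftp://example.com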

Testing a 10GB repository:
- wget -m: 42 minutes
- lftp --parallel=8: 9 minutes
- aria2 --split=16: 7 minutes