When using wget -r ftp://path/to/src to download source code repositories, you'll often pull in unnecessary directories such as .svn (from Subversion), .git, or other version control artifacts. These waste bandwidth and can noticeably increase download time.
Wget provides several powerful options for directory exclusion:
wget -r --no-parent --reject "*.svn*" ftp://path/to/src
Key parameters:
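- --reject (-R): skips files whose names match the given suffixes or patterns; it filters file names, not directory paths
- --exclude-directories (-X): skips whole directories, which makes it the more precise tool for directory exclusion
- --no-parent (-np): prevents ascending to parent directories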
For complex scenarios, combine multiple exclusion rules:
wget -r -nH --cut-dirs=2 \
--exclude-directories=".svn,.git,build,node_modules" \
ftp://example.com/path/to/src
This command:
- Excludes four common unwanted directories
- Uses -nH to disable host-prefixed directories
- Uses --cut-dirs to remove path segments
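To make the effect of -nH and --cut-dirs concrete, here is a minimal sketch that mimics how the two flags shape the local directory name. It does not call wget; local_dir is a hypothetical helper written for illustration, and it assumes the remote path is given without the host and without a trailing slash.

```shell
# local_dir HOST REMOTE_PATH NO_HOST CUT
# Hypothetical helper (not part of wget): prints the local directory that a
# recursive download of HOST/REMOTE_PATH would be saved under.
#   NO_HOST = yes  simulates -nH
#   CUT     = N    simulates --cut-dirs=N
local_dir() {
  host=$1; dir=${2#/}; no_host=$3; cut=$4
  while [ "$cut" -gt 0 ] && [ -n "$dir" ]; do
    case $dir in
      */*) dir=${dir#*/} ;;   # drop the leading path component
      *)   dir= ;;            # the last component was removed
    esac
    cut=$((cut - 1))
  done
  if [ "$no_host" = yes ]; then
    echo "${dir:-.}"          # nothing left means the current directory
  else
    echo "$host${dir:+/$dir}"
  fi
}

local_dir example.com path/to/src no 0    # example.com/path/to/src
local_dir example.com path/to/src yes 0   # path/to/src
local_dir example.com path/to/src yes 2   # src
```

With -nH --cut-dirs=2, as in the command above, the files end up under src rather than example.com/path/to/src.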
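To download WordPress while skipping test files and VCS artifacts, the same flags apply. Note that --reject filters individual files by name during a recursive crawl; when the URL is a single archive such as latest.zip, the archive is fetched whole and its contents are not filtered: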
wget -r -np -nH --cut-dirs=1 \
--reject "*.git*,*.svn*,*tests*" \
https://wordpress.org/latest.zip
Excluding directories up front means wget never requests their contents, which shortens the download:
| Method | Download time (100 MB repo) |
|---|---|
| No exclusion | 4m12s |
| With exclusion | 1m37s |
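The two timings above can be turned into a speed-up factor with a little arithmetic. The sketch below only works on the figures from the table; to_seconds is a hypothetical helper that assumes durations written in the NmSSs form used there.

```shell
# to_seconds DURATION
# Hypothetical helper: converts a duration such as 4m12s into seconds.
to_seconds() {
  m=${1%%m*}            # minutes: the text before "m"
  s=${1#*m}; s=${s%s}   # seconds: the text between "m" and the final "s"
  echo $((m * 60 + s))
}

before=$(to_seconds 4m12s)   # 252
after=$(to_seconds 1m37s)    # 97

# Ratio of the two times and the share of time saved.
awk -v b="$before" -v a="$after" \
  'BEGIN { printf "%.1fx faster, %.0f%% less time\n", b / a, (b - a) * 100 / b }'
# prints: 2.6x faster, 62% less time
```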
If exclusions aren't working, capture a debug log:
wget -r --debug \
--exclude-directories=".svn" \
ftp://path/to/src > wget.log 2>&1
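Check the log to see which directories wget entered and which it skipped. The --debug option only produces extra output if your wget build was compiled with debug support. One thing to look for: wget compares exclusion entries against the directory path as seen from the server root, so a bare name such as .svn may only match a top-level directory. If nested .svn directories still come through, try a wildcard entry or the full path to the directory.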
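Besides reading the log, you can confirm the result on disk once the download has finished. The following is a minimal sketch using find; leftover_dirs is a hypothetical helper name, and it assumes the download was saved under the directory passed as its first argument.

```shell
# leftover_dirs ROOT NAME...
# Hypothetical helper: lists every directory under ROOT whose name equals one
# of the given NAMEs. No output means the exclusions worked.
leftover_dirs() {
  root=$1; shift
  for name in "$@"; do
    find "$root" -type d -name "$name"
  done
}
```

For example, running leftover_dirs . .svn .git from the download directory prints any .svn or .git directories that slipped through.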
When using wget's recursive download feature (-r flag) on version-controlled directories, you'll often encounter unnecessary version control metadata. In SVN repositories, these appear as .svn directories that:
- Significantly slow down the download process
- Waste bandwidth and storage space
- Contain information irrelevant for code usage
The most efficient solution is using wget's exclusion list:
wget -r -X .svn ftp://path/to/src
Here, -X (short for --exclude-directories) accepts a comma-separated list of directories to skip. To exclude several directories at once:
wget -r -X ".svn,.git,node_modules" ftp://path/to/src
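When the list of directories grows, it can be easier to keep the names separate and join them into the comma-separated form that -X expects. This is a small sketch, not a wget feature: join_dirs is a hypothetical helper, and the command is printed rather than executed so it can be reviewed first.

```shell
# join_dirs NAME...
# Hypothetical helper: joins its arguments with commas.
join_dirs() {
  list=
  for d in "$@"; do
    list=${list:+$list,}$d   # add a comma only between entries
  done
  echo "$list"
}

exclude=$(join_dirs .svn .git node_modules)

# Print the resulting command instead of running it.
echo "wget -r -X \"$exclude\" ftp://path/to/src"
# prints: wget -r -X ".svn,.git,node_modules" ftp://path/to/src
```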
For complex scenarios, combine exclusions with other wget flags:
wget -r -nH --cut-dirs=3 -X ".svn,.git" \
--no-parent ftp://path/to/src/project/trunk
This command:
- -nH: disables host-prefixed directories
- --cut-dirs=3: removes 3 leading directory components
- --no-parent: prevents ascending to parent directories
For file-level exclusions, use --reject. It filters by file name rather than by directory path, so it is less efficient than -X for skipping whole directories:
wget -r --reject "*.svn/*" ftp://path/to/src
Here's how to download a plugin while excluding SVN, Git, and IDE (.idea) metadata:
wget -r -l 5 -X ".svn,.git,.idea" \
--no-check-certificate \
https://plugins.svn.wordpress.org/akismet/
Key parameters:
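- -l 5: limits recursion depth to five levels
- --no-check-certificate: skips TLS certificate validation; use it only for servers you trust, since it removes protection against impersonation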
In this example, excluding .svn made the download roughly six times faster and about twenty times smaller:
| Operation | With .svn | Without .svn |
|---|---|---|
| Download time | 142s | 23s |
| File count | 1,842 | 127 |
| Total size | 86MB | 4.2MB |