How to Exclude Directories in Wget: Skip .svn and Other Unwanted Folders When Downloading via FTP


When using wget -r ftp://path/to/src to download source code repositories, you'll often encounter unnecessary directories like .svn (from Subversion), .git, or other version control artifacts. These not only waste bandwidth but significantly increase download time.

Wget provides several options for filtering a recursive download:

wget -r --no-parent --reject "*.svn*" ftp://path/to/src

Key parameters:

  • --reject (-R): Skips files whose names match a suffix or pattern; wget may still descend into a directory with a matching name
  • --exclude-directories (-X): Skips whole directories, so it is the better fit for .svn and similar folders
  • --no-parent (-np): Prevents ascending to parent directories

For complex scenarios, combine multiple exclusion rules:

wget -r -nH --cut-dirs=2 \
--exclude-directories=".svn,.git,build,node_modules" \
ftp://example.com/path/to/src

This command:

  1. Excludes four common unwanted directories
  2. Uses -nH to disable host-prefixed directories
  3. Uses --cut-dirs to remove path segments
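
The effect of -nH and --cut-dirs on local paths can be sketched with a small shell function. This only illustrates the path arithmetic and is not wget's own code; the function name local_path and the file name main.c are made up for the example.

```shell
# Sketch: predict where wget saves a file, given -nH and --cut-dirs.
# Usage: local_path HOST REMOTE_PATH NO_HOST(0|1) CUT_DIRS
# (local_path is a hypothetical helper, not part of wget.)
local_path() (
    host=$1; remote=${2#/}; no_host=$3; cut=$4
    file=${remote##*/}               # last component is the file name
    case $remote in
        */*) dirs=${remote%/*} ;;    # leading directory components
        *)   dirs= ;;
    esac
    # Drop up to $cut leading directory components, as --cut-dirs does.
    while [ "$cut" -gt 0 ] && [ -n "$dirs" ]; do
        case $dirs in
            */*) dirs=${dirs#*/} ;;
            *)   dirs= ;;
        esac
        cut=$((cut - 1))
    done
    out=$file
    if [ -n "$dirs" ]; then out=$dirs/$out; fi
    # Without -nH, wget prefixes the host name.
    if [ "$no_host" -eq 0 ]; then out=$host/$out; fi
    printf '%s\n' "$out"
)

local_path example.com /path/to/src/main.c 0 0   # example.com/path/to/src/main.c
local_path example.com /path/to/src/main.c 1 2   # src/main.c
```

With -nH and --cut-dirs=2, only the src component survives, which matches the command above.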

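The same filters apply to HTTP(S) downloads, provided the URL points to a browsable directory tree. They have no effect on a single archive such as https://wordpress.org/latest.zip, which wget fetches as one file. To mirror a WordPress source tree while excluding test directories and VCS folders (the URL below is a placeholder):

wget -r -np -nH --cut-dirs=1 \
--exclude-directories="/wordpress/.git,/wordpress/.svn,/wordpress/tests" \
https://example.com/wordpress/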

Excluding directories early in the process saves significant time:

Method          Time (100MB repo)
No exclusion    4m12s
With exclusion  1m37s

If exclusions aren't working:

wget -r --debug \
--exclude-directories=".svn" \
ftp://path/to/src > wget.log 2>&1

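Search the log for the directory names you expected to skip. Wget typically reports when it declines to descend into an excluded directory or rejects a file, so a missing entry suggests the pattern never matched.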
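
A loose filter can pull the relevant lines out of a long debug log. The sketch below assumes the log was written to wget.log as in the command above; the function name show_filter_decisions is hypothetical, and because the exact wording of wget's messages can differ between versions, it matches broadly on "exclud" and "reject" rather than on a fixed phrase.

```shell
# Sketch: print log lines that mention excluded directories or rejected files.
# Assumption: message wording varies by wget version, so match loosely
# and case-insensitively. Line numbers (-n) help locate the context.
show_filter_decisions() {
    grep -inE 'exclud|reject' "${1:-wget.log}"
}
```

Run it as show_filter_decisions wget.log. If it prints nothing, either the patterns never matched or this wget build words its messages differently, in which case searching for the directory name itself is a reasonable next step.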


When you run wget's recursive download (-r) against a version-controlled tree, you'll often pull in version control metadata as well. In SVN working copies, this metadata lives in .svn directories that:

  • Significantly slow down the download process
  • Waste bandwidth and storage space
  • Contain information irrelevant for code usage

The most efficient solution is using wget's exclusion list:

wget -r -X .svn ftp://path/to/src

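Here -X (short for --exclude-directories) accepts a comma-separated list of directories to skip, and entries may contain wildcards. Wget compares each entry with the directory's path on the server (the manual's own example is -X /cgi-bin), so if a bare name such as .svn fails to skip nested copies, try a wildcard entry such as "*/.svn" or the full path. To skip several directories: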

wget -r -X ".svn,.git,node_modules" ftp://path/to/src
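
When the unwanted directories sit at several depths, writing the list by hand gets tedious. The helper below is a sketch that generates one entry per depth for each name. It rests on an assumption worth checking against your wget version: that a "*" in a -X entry matches a single path component and does not span "/". The function name exclude_list is hypothetical.

```shell
# Sketch: build a -X list covering each name at depths 0..MAX_DEPTH.
# Usage: exclude_list MAX_DEPTH NAME...
# Assumption: "*" in a -X entry matches one path component only,
# so each depth needs its own "*/" prefix.
exclude_list() (
    max_depth=$1; shift
    list=
    for name in "$@"; do
        prefix=
        depth=0
        while [ "$depth" -le "$max_depth" ]; do
            list=${list:+$list,}$prefix$name   # append with a comma separator
            prefix="*/$prefix"                 # one more directory level
            depth=$((depth + 1))
        done
    done
    printf '%s\n' "$list"
)

exclude_list 2 .svn .git
# .svn,*/.svn,*/*/.svn,.git,*/.git,*/*/.git
```

The result can be passed straight to wget, for example wget -r -X "$(exclude_list 3 .svn .git)" ftp://path/to/src. Keep the quotes so the shell does not expand the wildcards itself.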

For complex scenarios, combine exclusions with other wget flags:

wget -r -nH --cut-dirs=3 -X ".svn,.git" \
     --no-parent ftp://path/to/src/project/trunk

In this command:

  • -nH disables host-prefixed directories
  • --cut-dirs=3 removes 3 leading directory components
  • --no-parent prevents ascending to parent directories

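For file-level exclusions, use --reject (-R). It matches file names rather than directory paths, so wget can still enter a .svn directory and list its contents, which makes it less efficient than -X for skipping directories. A pattern containing a slash, such as "*.svn/*", is compared with file names and is unlikely to match, so keep the pattern to the name itself:

wget -r --reject "*.svn*" ftp://path/to/src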

Here's how to download a plugin while excluding SVN, Git, and IDE metadata:

wget -r -l 5 -X ".svn,.git,.idea" \
     --no-check-certificate \
     https://plugins.svn.wordpress.org/akismet/

Key parameters:

  • -l 5 limits recursion depth
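  • --no-check-certificate skips TLS certificate verification; the connection is still encrypted but the server's identity is not checked, so use it only with servers you trust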

Excluding directories provides significant benefits:

Metric         With .svn  Without .svn
Download time  142s       23s
File count     1,842      127
Total size     86MB       4.2MB
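
To put the table's figures in relative terms, the percentage saved in each row can be computed with a one-line awk call. This sketch only does arithmetic on the numbers quoted above; the function name reduction is hypothetical.

```shell
# Sketch: percentage reduction from BEFORE to AFTER, rounded to a whole number.
# Usage: reduction BEFORE AFTER
reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.0f%%\n", (before - after) / before * 100 }'
}

reduction 142 23      # download time in seconds -> 84%
reduction 1842 127    # file count               -> 93%
reduction 86 4.2      # total size in MB         -> 95%
```

To compare your own runs, substitute the times and sizes you measure with and without the exclusion list.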