When working with wget, developers often run into a default behavior that gets in the way: if a file with the same name already exists, wget does not overwrite it but saves the new download under a numbered name such as file.html.1. This safety feature becomes a problem in automation scripts or when you are certain you want fresh copies.
Many developers mistakenly try these common but incorrect approaches:
wget -nc http://example.com/file.html # Wrong: Prevents overwriting
wget -N http://example.com/file.html # Wrong: Only overwrites if server version is newer
Here's how to truly force overwrites in different scenarios:
# Basic force overwrite
wget -O existing_file.html http://example.com/file.html
# Recursive downloads overwrite existing files by default, as long as -nc, -N, and -nd are not given
wget -r --no-host-directories --force-directories http://example.com/path/file.html
# Keep the original filename; the previous copy is retained as file.html.1
wget --backups=1 http://example.com/file.html
This technique shines in:
- Continuous integration pipelines that need fresh assets
- Cron jobs for regular data updates
- Automated deployment scripts
Remember that forced overwrites:
- Bypass all file comparison checks
- May cause data loss if used carelessly
- Should be combined with proper error handling in scripts
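One way to combine a forced overwrite with error handling is to download to a temporary file and replace the target only when wget succeeds. The sketch below assumes a hypothetical helper named fetch_atomic (not part of wget) and treats an empty download as a failure; adjust that check if empty files are valid in your case.

```shell
#!/bin/bash
# Hypothetical helper: download URL to DEST, replacing DEST only on success.
fetch_atomic() {
    local url="$1" dest="$2" tmp
    # Create the temp file next to DEST so the final mv stays on one filesystem.
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    if wget -q -O "$tmp" "$url" && [ -s "$tmp" ]; then
        # Download succeeded and is non-empty: replace the old copy.
        mv -f "$tmp" "$dest"
    else
        # Download failed: discard the partial file and leave DEST untouched.
        rm -f "$tmp"
        echo "download failed, keeping existing $dest" >&2
        return 1
    fi
}
```

A call such as fetch_atomic http://example.com/file.html file.html leaves the old file.html in place if the download fails. Note that mktemp creates the file with owner-only permissions, so a chmod may be needed after the move.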
When downloading files, wget takes a conservative approach by default to prevent accidental data loss:
- If a local file with the same name exists, wget appends .1, .2, etc. to the new file's name
- No automatic overwriting occurs unless explicitly instructed
- Timestamp comparison is performed only with certain options, such as -N
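The numbering rule can be illustrated with a small shell function that predicts which name a plain wget call (no -O, -nc, -N, or -r) would save to. This is a sketch that mirrors the documented behavior, not wget's own code, and default_save_name is a hypothetical name.

```shell
# Hypothetical helper: print the name a default wget run would save to
# for a given base filename in the current directory.
default_save_name() {
    local base="$1" n=1
    # No conflict: the file is saved under its own name.
    if [ ! -e "$base" ]; then
        echo "$base"
        return
    fi
    # Otherwise find the first unused numeric suffix: .1, .2, ...
    while [ -e "$base.$n" ]; do
        n=$((n + 1))
    done
    echo "$base.$n"
}
```

With file.html and file.html.1 already present, default_save_name file.html prints file.html.2, which is why repeated runs in a cron job slowly fill a directory with numbered copies.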
While -nc or --no-clobber prevents overwriting, we actually want the opposite behavior. Here's the correct approach:
wget -O file1.html http://server/folder/file1.html
Or alternatively:
wget --output-document=file1.html http://server/folder/file1.html
Using -O (uppercase O) specifies the exact output filename, forcing wget to:
- Overwrite any existing file with the specified name
- Ignore timestamp comparisons
- Bypass the automatic numbering system
For batch operations or scripting, consider these variations:
# Overwrite multiple files in a loop
for url in \
"http://example.com/file1.txt" \
"http://example.com/file2.txt"
do
filename=$(basename "$url")
wget -O "$filename" "$url"
done
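In a loop like this, basename keeps any query string as part of the filename, so a URL ending in file1.txt?v=2 would be saved under that literal name. A minimal sketch of a more forgiving name derivation follows; url_to_filename is a hypothetical helper, and the index.html fallback is an assumption chosen to match the name wget itself uses for directory URLs.

```shell
# Hypothetical helper: derive a local filename from a URL,
# dropping any query string or fragment first.
url_to_filename() {
    local url="${1%%[?#]*}"   # strip ?query and #fragment
    local name="${url##*/}"   # keep the last path component
    # Fall back to index.html when the URL ends in a slash.
    echo "${name:-index.html}"
}
```

Inside the loop it could be used as wget -O "$(url_to_filename "$url")" "$url".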
Or when dealing with dynamic content:
# Force overwrite and stamp the file with the local download time instead of the server's Last-Modified time
wget -O output.html --no-use-server-timestamps http://example.com/page
Be cautious with these scenarios:
- Using relative vs. absolute paths with -O
- Permission issues on the target directory
- Accidentally overwriting important files in scripts
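A script can guard against the path and permission pitfalls by checking the target before calling wget. The following is a sketch under stated assumptions: safe_target is a hypothetical function, and the policy it enforces (the target must resolve inside one allowed, writable directory) is only one possible rule.

```shell
# Hypothetical guard: succeed only if TARGET resolves inside ALLOWED_DIR
# and its parent directory is writable.
safe_target() {
    local target="$1" allowed_dir="$2" dir
    # Resolve both directories to absolute, symlink-free paths.
    dir=$(cd "$(dirname "$target")" 2>/dev/null && pwd -P) || return 1
    allowed_dir=$(cd "$allowed_dir" 2>/dev/null && pwd -P) || return 1
    case "$dir/" in
        # Inside the allowed tree: the result is the writability check.
        "$allowed_dir"/*) [ -w "$dir" ] ;;
        # Anywhere else, including paths that escape via "..": refuse.
        *) return 1 ;;
    esac
}
```

A download step could then read safe_target "$out" "$PWD/downloads" && wget -O "$out" "$url", where out, url, and the downloads directory are placeholders.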
For production scripts, consider adding safeguards:
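#!/bin/bash
TARGET_FILE="important.html"
BACKUP_FILE="${TARGET_FILE}.bak"
# Create backup before overwriting (skipped silently if the target does not exist yet)
cp "$TARGET_FILE" "$BACKUP_FILE" 2>/dev/null || true
# Perform the download with forced overwrite.
# -O truncates the target even when the download fails, so restore the backup on error.
if ! wget -O "$TARGET_FILE" http://example.com/important.html; then
    [ -f "$BACKUP_FILE" ] && cp "$BACKUP_FILE" "$TARGET_FILE"
    exit 1
fi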