When trying to modify URLs across multiple HTML files, many developers hit a wall with special character escaping in sed. The initial attempts often fail because:
# This fails due to incorrect escaping
sed -i '/http:/\\/\\cdn1/http:/\\/\\cdn1/' file.html
# This fails because the leading s (substitute) command is missing
sed -i '/http:\\/\\/cdn1/http:\\/\\/cdn1/' file.html
The correct approach uses different delimiters to avoid escaping hell:
# Using | as delimiter instead of /
sed -i 's|http://cdn1|https://cdn1|g' *.html
# Or with full domain replacement
sed -i 's|http://cdn1\.domain\.com|https://cdn1.domain.com|g' *.html
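To see that the delimiter is only a readability choice, the following sketch runs the slash-escaped and pipe-delimited forms against the same line and compares the results. The `sample` string is made up for illustration and is not taken from any real file.

```shell
#!/bin/sh
# Hypothetical sample line (an assumption for this demo).
sample='<script src="http://cdn1.domain.com/app.js"></script>'

# The same substitution written with / (every slash escaped) and with | (none escaped).
with_slash=$(printf '%s\n' "$sample" | sed 's/http:\/\/cdn1/https:\/\/cdn1/g')
with_pipe=$(printf '%s\n' "$sample" | sed 's|http://cdn1|https://cdn1|g')

printf '%s\n' "$with_pipe"
[ "$with_slash" = "$with_pipe" ] && echo "both forms agree"
```

Any character that does not occur in the pattern or replacement can serve as the delimiter; `|`, `#` and `,` are common picks for URLs.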
To safely process all 200 HTML files:
# Basic recursive find + sed
find . -type f -name "*.html" -exec sed -i 's|http://cdn1|https://cdn1|g' {} +
# With backup files (recommended for safety)
find . -type f -name "*.html" -exec sed -i.bak 's|http://cdn1|https://cdn1|g' {} +
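The `.bak` copies are only useful if there is a way back. Here is a minimal sketch of an edit followed by a rollback, run in a throwaway directory created with `mktemp` so that no real files are touched. The file names and contents are assumptions for the demo.

```shell
#!/bin/sh
# Throwaway directory with two hypothetical pages.
workdir=$(mktemp -d) || exit 1
mkdir -p "$workdir/sub"
printf '<img src="http://cdn1.domain.com/a.png">\n' > "$workdir/index.html"
printf '<img src="http://cdn1.domain.com/b.png">\n' > "$workdir/sub/page.html"

# Rewrite in place; sed keeps each original next to it as *.html.bak.
find "$workdir" -type f -name '*.html' \
  -exec sed -i.bak 's|http://cdn1|https://cdn1|g' {} +
edited=$(cat "$workdir/sub/page.html")

# Roll back: move every backup over the file it was copied from.
find "$workdir" -type f -name '*.html.bak' \
  -exec sh -c 'for bak do mv -- "$bak" "${bak%.bak}"; done' sh {} +
restored=$(cat "$workdir/sub/page.html")

printf 'edited:   %s\nrestored: %s\n' "$edited" "$restored"
```

Writing the suffix directly after `-i` (as in `-i.bak`) is accepted by both GNU and BSD sed, whereas a bare `-i` behaves differently between the two.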
For more complex URL patterns:
# Match any single-digit CDN subdomain (cdn0 through cdn9)
sed -i 's|http://cdn\([0-9]\)\.|https://cdn\1.|g' *.html
# Case-insensitive matching
sed -i 's|http://\([Cc][Dd][Nn][0-9]\)|https://\1|g' *.html
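Capture-group patterns are easy to get subtly wrong, so it can help to try them on a line of sample text before pointing them at files. The sketch below does that; the three hosts in `input` are invented for the test.

```shell
#!/bin/sh
# Hypothetical input: a lower-case CDN host, an upper-case one, and a non-CDN host.
input='http://cdn1.domain.com http://CDN2.domain.com http://www.domain.com'

# \( ... \) captures the matched host prefix; \1 reuses it in the replacement,
# so the original casing and digit are preserved.
output=$(printf '%s\n' "$input" |
  sed 's|http://\([Cc][Dd][Nn][0-9]\)\.|https://\1.|g')

printf '%s\n' "$output"
```

This prints `https://cdn1.domain.com https://CDN2.domain.com http://www.domain.com`: both CDN hosts are upgraded with their casing intact, and the `www` host is left alone.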
Always verify your changes:
# Count matches before/after
grep -r "http://cdn1" . | wc -l
grep -r "https://cdn1" . | wc -l
# Dry run (no changes): print only the lines that would be rewritten
find . -name "*.html" -exec sed -n 's|http://cdn1|https://cdn1|gp' {} +
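The before/after counts can be wrapped in a small helper so the same check runs on both sides of the edit. This is a sketch under assumptions: `count_lines` is a hypothetical helper name, and the two sample files exist only for the demo.

```shell
#!/bin/sh
# Throwaway directory with sample content; one line is already HTTPS.
workdir=$(mktemp -d) || exit 1
printf 'http://cdn1/a\nhttp://cdn1/b\n' > "$workdir/one.html"
printf 'http://cdn1/c\nhttps://cdn1/d\n' > "$workdir/two.html"

# count_lines DIR STRING: number of lines in DIR's *.html files containing STRING.
count_lines() {
  find "$1" -type f -name '*.html' -exec cat {} + | grep -cF -- "$2"
}

before=$(count_lines "$workdir" 'http://cdn1')
find "$workdir" -type f -name '*.html' \
  -exec sed -i.bak 's|http://cdn1|https://cdn1|g' {} +
remaining=$(count_lines "$workdir" 'http://cdn1')
converted=$(count_lines "$workdir" 'https://cdn1')

echo "before: $before, remaining: $remaining, converted: $converted"
```

On this sample the script prints `before: 3, remaining: 0, converted: 4`. Note that `grep -c` counts matching lines, not individual matches, so a line holding two URLs is counted once.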
When migrating web content from HTTP to HTTPS, developers often need to update numerous HTML files. A common scenario is changing http://cdn1.domain.com to https://cdn1.domain.com across hundreds of files. While manual editing is impractical, Linux's sed stream editor provides an efficient solution.
The initial attempts show common pitfalls when working with sed:
# Problem 1: Incorrect escape sequence
sed -i '/http:/\\/\\cdn1/http:/\\/\\cdn1/' cum-comand.html
# Error: unknown command: \\'
# Problem 2: Missing substitution command
sed -i '/http:\\/\\/cdn1/http:\\/\\/cdn1/' cum-comand.html
# Error: extra characters after command
The proper syntax for global in-place replacement across multiple HTML files:
sed -i 's|http://cdn1|https://cdn1|g' *.html
For more complex requirements, consider these variations:
Case 1: Preserving Subdomains and Paths
sed -i 's|http://cdn1\.domain\.com|https://cdn1.domain.com|g' *.html
Case 2: Recursive Directory Processing
find . -type f -name "*.html" -exec sed -i 's|http://cdn1|https://cdn1|g' {} +
Case 3: Dry Run Verification
sed 's|http://cdn1|https://cdn1|g' example.html | grep https://cdn1
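A dry run is more informative when it shows old and new lines side by side. One possible approach, sketched below, pipes sed's output into `diff` against the untouched file. The contents of `example.html` here are an assumption for the demo.

```shell
#!/bin/sh
# Throwaway copy of a hypothetical example.html.
workdir=$(mktemp -d) || exit 1
printf '<a href="http://cdn1.domain.com/x">x</a>\n<p>unchanged</p>\n' \
  > "$workdir/example.html"

# No -i flag: sed writes to stdout only, and diff reads that from "-".
preview=$(sed 's|http://cdn1|https://cdn1|g' "$workdir/example.html" |
  diff "$workdir/example.html" -)

printf '%s\n' "$preview"
```

Lines prefixed with `<` are the current content and lines prefixed with `>` are what sed would write; the file on disk stays as it was.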
For large-scale replacements (200+ files):
- Use parallel with sed for multi-core processing
- Consider ripgrep or ack for faster file searching
- Back up files first:
cp -r html_files/ html_files_backup/
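As a rough illustration of the multi-core idea, the sketch below uses `xargs -P` rather than GNU `parallel`, since `xargs` is usually already installed. The `-print0`, `-0` and `-P` options are extensions found in GNU and BSD tools rather than POSIX, and the 20 generated pages, the batch size of 5 and the limit of 4 processes are arbitrary assumptions.

```shell
#!/bin/sh
# Generate 20 hypothetical pages in a throwaway directory.
workdir=$(mktemp -d) || exit 1
i=1
while [ "$i" -le 20 ]; do
  printf '<link href="http://cdn1.domain.com/%s.css">\n' "$i" > "$workdir/page$i.html"
  i=$((i + 1))
done

# Run up to 4 sed processes at once, each handling a batch of 5 files.
# -print0 / -0 keep file names with spaces or newlines intact.
find "$workdir" -type f -name '*.html' -print0 |
  xargs -0 -n 5 -P 4 sed -i.bak 's|http://cdn1|https://cdn1|g'

left=$(cat "$workdir"/*.html | grep -cF 'http://cdn1')
echo "lines still referencing http://cdn1: $left"
```

For a couple of hundred small HTML files a single sed invocation is typically fast enough, so parallelism mostly pays off on much larger trees.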
Tool | Command | Best For |
---|---|---|
perl | perl -pi -e 's/http:\/\/cdn1/https:\/\/cdn1/g' *.html | Complex regex |
awk | awk '{gsub(/http:\/\/cdn1/, "https://cdn1"); print}' file.html | Structured data |
vim | :argdo %s/http:\/\/cdn1/https:\/\/cdn1/gc \| update | Interactive review |