How to Replace HTTP with HTTPS URLs in Multiple Files Using sed Command in Linux



When trying to modify URLs across multiple HTML files, many developers hit a wall with special character escaping in sed. The initial attempts often fail because:

# This fails: without a leading s, sed reads /http:/ as an address and \ as the command
sed -i '/http:/\/\cdn1/http:/\/\cdn1/' file.html

# This also fails: /http:\/\/cdn1/ is parsed as an address and h as a command with trailing text
sed -i '/http:\/\/cdn1/http:\/\/cdn1/' file.html

The correct approach starts the expression with the s (substitute) command and uses a different delimiter to avoid escaping hell:

# Using | as delimiter instead of /
sed -i 's|http://cdn1|https://cdn1|g' *.html

# Or with full domain replacement (dots escaped so they match only a literal ".")
sed -i 's|http://cdn1\.domain\.com|https://cdn1.domain.com|g' *.html
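
A quick way to confirm that the delimiter choice does not change the result is to run both forms on one line of sample input. This is a minimal sketch: the sample tag and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sample input line (hypothetical markup, for illustration only).
line='<script src="http://cdn1.domain.com/app.js"></script>'

# Default / delimiter: every slash in the URL has to be escaped.
with_slash=$(printf '%s\n' "$line" | sed 's/http:\/\/cdn1/https:\/\/cdn1/g')

# Alternate | delimiter: the URL can be written as-is.
with_pipe=$(printf '%s\n' "$line" | sed 's|http://cdn1|https://cdn1|g')

printf '%s\n' "$with_slash"
printf '%s\n' "$with_pipe"
```

Both commands print the same rewritten line. Any character that does not occur in the pattern or the replacement can serve as the delimiter, for example # or a comma.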

To safely process all 200 HTML files:

# Basic recursive find + sed
find . -type f -name "*.html" -exec sed -i 's|http://cdn1|https://cdn1|g' {} +

# With backup files (recommended for safety)
find . -type f -name "*.html" -exec sed -i.bak 's|http://cdn1|https://cdn1|g' {} +
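
Before running this against real content, it can help to try the backup variant in a throwaway directory. The sketch below assumes a sed that accepts -i.bak (GNU and BSD sed both do); the directory layout and file contents are invented for the demonstration.

```shell
#!/bin/sh
# Build a throwaway tree so the command can be tried without touching real files.
workdir=$(mktemp -d)
mkdir -p "$workdir/site/blog"
printf '%s\n' '<img src="http://cdn1.domain.com/a.png">' > "$workdir/site/index.html"
printf '%s\n' '<img src="http://cdn1.domain.com/b.png">' > "$workdir/site/blog/post.html"

# Same find + sed command, pointed at the sandbox instead of the current directory.
find "$workdir/site" -type f -name "*.html" \
  -exec sed -i.bak 's|http://cdn1|https://cdn1|g' {} +

# Each edited file now has a .bak sibling holding the original content.
find "$workdir/site" -type f | sort

# To undo, move every backup back over its edited file:
# find "$workdir/site" -name "*.html.bak" -exec sh -c 'mv "$1" "${1%.bak}"' _ {} \;
```

Once the result looks right, the .bak files can be deleted with find . -name "*.html.bak" -delete (the -delete action is available in GNU and BSD find).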

For more complex URL patterns:

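# Match any single-digit CDN subdomain (cdn0 to cdn9); \( \) captures the digit for \1
sed -i 's|http://cdn\([0-9]\)\.|https://cdn\1.|g' *.html

# Case insensitive matching of "cdn"; the captured host keeps its original case
sed -i 's|http://\([Cc][Dd][Nn][0-9]\)|https://\1|g' *.html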
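
A back-reference such as \1 only works when the pattern contains a capture group. The sketch below shows the group in basic (BRE) and extended (-E) syntax on made-up input. The -E option and the I flag are assumptions about your sed: -E is supported by GNU sed and recent BSD sed, while the I (ignore case) flag is a GNU extension.

```shell
#!/bin/sh
# Hypothetical input with several hosts, for illustration only.
input='http://cdn1.domain.com http://cdn2.domain.com http://www.domain.com'

# BRE syntax: \( \) captures the digit, \1 re-inserts it.
bre=$(printf '%s\n' "$input" | sed 's|http://cdn\([0-9]\)\.|https://cdn\1.|g')

# ERE syntax (-E): plain parentheses capture.
ere=$(printf '%s\n' "$input" | sed -E 's|http://cdn([0-9])\.|https://cdn\1.|g')

# GNU-only I flag: match regardless of case, keep the host as originally written.
mixed=$(printf '%s\n' 'HTTP://CDN1.domain.com' | sed 's|http://\(cdn[0-9]\)|https://\1|gI')

printf '%s\n' "$bre" "$ere" "$mixed"
```

Only the cdn hosts are rewritten; the www URL is left alone because it does not match the pattern.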

Always verify your changes:

# Count matches before/after
grep -r "http://cdn1" . | wc -l
grep -r "https://cdn1" . | wc -l

# Dry run (no changes): print only the lines that would change
find . -name "*.html" -exec grep -l "http://cdn1" {} + | xargs sed -n 's|http://cdn1|https://cdn1|gp'
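
The before/after count can be wrapped in a small helper so the two numbers are directly comparable. This is a sketch on invented sample files; count_old is a hypothetical helper name, and --include is a grep option available in GNU and BSD grep (used here so that .bak backups are not counted).

```shell
#!/bin/sh
# Throwaway sample files (hypothetical content) to demonstrate the check.
workdir=$(mktemp -d)
printf '%s\n' '<a href="http://cdn1.domain.com/x">x</a>' \
              '<a href="http://cdn1.domain.com/y">y</a>' > "$workdir/a.html"
printf '%s\n' '<p>no cdn link here</p>' > "$workdir/b.html"

# Count lines that still contain the old URL, looking only at *.html files.
count_old() {
  grep -r --include='*.html' 'http://cdn1' "$1" | wc -l | tr -d ' '
}

before=$(count_old "$workdir")
sed -i.bak 's|http://cdn1|https://cdn1|g' "$workdir"/*.html
after=$(count_old "$workdir")

echo "lines with old URL: before=$before after=$after"
```

A non-zero count after the run points at URLs the pattern missed, for example a different subdomain or uppercase spelling.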

When migrating web content from HTTP to HTTPS, developers often need to update numerous HTML files. A common scenario is changing http://cdn1.domain.com to https://cdn1.domain.com across hundreds of files. While manual editing is impractical, Linux's sed stream editor provides an efficient solution.

The initial attempts show common pitfalls when working with sed:

# Problem 1: No s command, so /http:/ is read as an address and \ as the command
sed -i '/http:/\/\cdn1/http:/\/\cdn1/' cum-comand.html
# Error: unknown command: `\'

# Problem 2: Missing substitution command (the escaped pattern is read as an address, h as the command)
sed -i '/http:\/\/cdn1/http:\/\/cdn1/' cum-comand.html
# Error: extra characters after command

The proper syntax for global in-place replacement across multiple HTML files:

sed -i 's|http://cdn1|https://cdn1|g' *.html

For more complex requirements, consider these variations:

Case 1: Preserving Subdomains and Paths

sed -i 's|http://cdn1\.domain\.com|https://cdn1.domain.com|g' *.html

Case 2: Recursive Directory Processing

find . -type f -name "*.html" -exec sed -i 's|http://cdn1|https://cdn1|g' {} +

Case 3: Dry Run Verification

sed 's|http://cdn1|https://cdn1|g' example.html | grep https://cdn1
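
Piping the dry-run output into diff shows the old and new versions of each affected line side by side. A minimal sketch, using an invented example.html in a temporary directory:

```shell
#!/bin/sh
# Hypothetical sample file for illustration.
workdir=$(mktemp -d)
printf '%s\n' '<link href="http://cdn1.domain.com/site.css">' \
              '<p>unchanged line</p>' > "$workdir/example.html"

# Without -i, sed writes to stdout and leaves the file alone.
# diff exits with status 1 when it finds differences, hence the "|| true".
preview=$(sed 's|http://cdn1|https://cdn1|g' "$workdir/example.html" \
  | diff "$workdir/example.html" - || true)

printf '%s\n' "$preview"
```

Lines starting with < come from the file on disk, lines starting with > are what sed would write. An empty preview means the file contains nothing to replace.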

For large-scale replacements (200+ files):

  • Use parallel with sed for multi-core processing
  • Consider ripgrep or ack for faster file searching
  • Backup files first: cp -r html_files/ html_files_backup/
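
GNU parallel is not installed everywhere, so the sketch below swaps in xargs -P, which GNU and BSD xargs both support, to get the same multi-process effect. The worker count (4) and batch size (50) are arbitrary assumptions, the sample tree is invented, and sed -i without a suffix assumes GNU sed.

```shell
#!/bin/sh
# Throwaway tree with a few sample files (hypothetical content).
workdir=$(mktemp -d)
mkdir -p "$workdir/html_files"
for i in 1 2 3 4 5; do
  printf '%s\n' "<img src=\"http://cdn1.domain.com/$i.png\">" \
    > "$workdir/html_files/page$i.html"
done

# Back up the whole directory first.
cp -r "$workdir/html_files" "$workdir/html_files_backup"

# -print0 / -0 keep filenames with spaces intact;
# -P 4 runs up to four sed processes at once, -n 50 gives each one up to 50 files.
find "$workdir/html_files" -type f -name "*.html" -print0 |
  xargs -0 -P 4 -n 50 sed -i 's|http://cdn1|https://cdn1|g'

remaining=$(grep -rl 'http://cdn1' "$workdir/html_files" | wc -l | tr -d ' ')
echo "files still using http: $remaining"
```

For a few hundred small HTML files a single sed process is usually fast enough; parallel workers mainly pay off on much larger trees.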

Alternative tools for the same replacement:

Tool  Command                                                         Best For
perl  perl -pi -e 's/http:\/\/cdn1/https:\/\/cdn1/g' *.html           Complex regex
awk   awk '{gsub(/http:\/\/cdn1/, "https://cdn1"); print}' file.html  Structured data
vim   :argdo %s/http:\/\/cdn1/https:\/\/cdn1/gc | update              Interactive review