When writing bash scripts, we often need to store complex multi-line strings, particularly XML fragments or JSON data, directly in the script. Ordinary quoted strings quickly become messy:
# Problematic approach:
xml="<?xml version=\"1.0\" encoding='UTF-8'?>\n<painting>\n <img src=\"madonna.jpg\" alt='Foligno Madonna, by Raphael'/>\n <caption>This is Raphael's \"Foligno\" Madonna, painted in\n <date>1511</date>-<date>1512</date>.</caption>\n</painting>"
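This requires excessive escaping and makes the XML unreadable within the script. Bash also does not interpret \n inside double quotes, so the variable ends up holding literal backslash-n sequences rather than line breaks.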
Bash provides a clean solution through "here documents" (heredocs), introduced with the <<EOF syntax:
# Clean multi-line XML storage
read -r -d '' xml_content <<EOF
<?xml version="1.0" encoding='UTF-8'?>
<painting>
<img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
<caption>This is Raphael's "Foligno" Madonna, painted in
<date>1511</date>-<date>1512</date>.</caption>
</painting>
EOF
Key advantages of this approach:
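- No need to escape quotes; with a quoted delimiter such as <<'EOF', there is no need to escape $, backticks, or backslashes either
- Preserves line breaks and interior indentation (prefix read with IFS= to also keep leading and trailing whitespace)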
- Maintains human readability within the script
- Works with complex nested XML/JSON structures
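Two details of the read-based form are easy to miss. First, read -d '' returns a non-zero exit status because it reaches the end of the heredoc without ever seeing a NUL byte, which matters under set -e. Second, without IFS= it trims leading and trailing whitespace. The sketch below shows one way to handle both; the variable names and sample content are illustrative assumptions.

```bash
#!/usr/bin/env bash

# Hypothetical sample content; the quoted 'EOF' keeps $ and backticks literal.
status=0
IFS= read -r -d '' xml_content <<'EOF' || status=$?
  <painting>
    <caption>Costs $5, says `nobody`</caption>
  </painting>
EOF

# read hit end-of-input before a NUL delimiter, so status is non-zero
# even though the variable was filled in.
echo "read exit status: $status"
printf '%s' "$xml_content"
```

In a script that runs under set -e, appending || true to the read line serves the same purpose as capturing the status here.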
For more complex scenarios, consider these variations:
# With variable substitution
read -r -d '' email_template <<EOF
From: $sender
To: $recipient
Subject: $subject
$message_body
EOF
# With command substitution
read -r -d '' deployment_config <<EOF
{
"version": "$(git rev-parse HEAD)",
"environment": "$ENV",
"timestamp": "$(date -u)"
}
EOF
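When only some dollar signs should expand, an unquoted heredoc allows a backslash before $ to keep that one occurrence literal. A small sketch, assuming a hypothetical variable app_env and made-up template text:

```bash
#!/usr/bin/env bash

app_env="staging"   # hypothetical value for illustration

# $(cat <<EOF ...) captures the heredoc; \$ suppresses expansion for that token only.
config=$(cat <<EOF
environment=$app_env
placeholder=\${FILLED_IN_LATER}
EOF
)

printf '%s\n' "$config"
```

The first line is expanded to environment=staging, while the second keeps ${FILLED_IN_LATER} verbatim for a later templating step.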
If the content might contain a line consisting solely of the delimiter (EOF in our examples), choose a different delimiter:
# Using a unique delimiter
read -r -d '' sql_query <<'UNIQUE_DELIMITER'
SELECT * FROM users
WHERE created_at > '2023-01-01'
AND status = 'active'
UNIQUE_DELIMITER
Note the single quotes around the delimiter name, which prevent variable expansion and command substitution within the here document.
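To make the delimiter rule concrete, here is a sketch in which the body itself contains a line reading EOF. The delimiter END_OF_DOC and the sample text are assumptions chosen for illustration.

```bash
#!/usr/bin/env bash

# The body contains a line reading exactly "EOF"; the heredoc ends only at END_OF_DOC.
doc=$(cat <<'END_OF_DOC'
first line
EOF
last line with $HOME left unexpanded
END_OF_DOC
)

printf '%s\n' "$doc"
```

Because the delimiter is quoted, $HOME is also kept as literal text.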
For very large multi-line strings (10KB+), consider generating the content on demand:
# Wrap the heredoc in a function so the content is built only when it is called
generate_large_xml() {
cat <<EOF
<?xml version="1.0"?>
<data>
$(for i in {1..10000}; do echo " <record id=\"$i\"/>"; done)
</data>
EOF
}
xml_content=$(generate_large_xml)
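If the per-iteration echo in the loop becomes a bottleneck, one alternative is to let printf repeat its format over a brace expansion. This is a sketch with a hypothetical function name and a deliberately small record count, not a measured optimization:

```bash
#!/usr/bin/env bash

# Hypothetical variant of generate_large_xml that relies on printf's format reuse.
generate_records_xml() {
  printf '%s\n' '<?xml version="1.0"?>' '<data>'
  # printf reapplies the format once per argument produced by {1..5}.
  printf ' <record id="%d"/>\n' {1..5}
  printf '%s\n' '</data>'
}

xml_content=$(generate_records_xml)
printf '%s\n' "$xml_content"
```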
To summarize: storing multi-line strings such as XML fragments in double-quoted, escaped strings quickly becomes unreadable and error-prone, especially with complex XML structures.
Bash's heredoc syntax addresses this. It allows:
- Preservation of all formatting and indentation
- No character escaping when the delimiter is quoted (the body only has to avoid a line consisting solely of the delimiter)
- Human-readable code structure
#!/bin/bash
read -r -d '' xml_content <<'EOF'
<?xml version="1.0" encoding='UTF-8'?>
<painting>
<img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
<caption>This is Raphael's "Foligno" Madonna, painted in
<date>1511</date>-<date>1512</date>.</caption>
</painting>
EOF
echo "$xml_content"
The heredoc approach offers several useful variations:
# Version with variable substitution
cat <<EOF
<config>
<user>$USER</user>
<home>$HOME</home>
</config>
EOF
# Version without variable substitution (literal).
# $USER below is not expanded because the delimiter is quoted.
cat <<'EOF'
<config>
<user>$USER</user>
</config>
EOF
If the content contains a line that matches the delimiter, choose a more distinctive delimiter, as in this SQL example:
read -r -d '' sql_query <<'SQL_END'
SELECT * FROM users
WHERE name LIKE 'John%'
AND created_at > '2023-01-01'
SQL_END
When dealing with very large XML fragments, consider these alternatives:
- Store the XML in a separate file and read it with $(cat file.xml)
- Use base64 encoding for binary XML or special characters
- Consider XML-aware tools like xmlstarlet for complex manipulations
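The first two alternatives can be sketched as follows. The temporary file stands in for file.xml, the sample content is made up, and the round trip assumes a base64 utility that accepts --decode.

```bash
#!/usr/bin/env bash

# Hypothetical temporary file standing in for file.xml.
tmp_file=$(mktemp)

cat > "$tmp_file" <<'EOF'
<note>
  <body>Tom &amp; Jerry's "special" $chars</body>
</note>
EOF

# $(<file) is bash's built-in shorthand for $(cat file).
xml_content=$(<"$tmp_file")

# Round-trip through base64 (assumes the utility supports --decode).
encoded=$(base64 < "$tmp_file")
decoded=$(printf '%s\n' "$encoded" | base64 --decode)

[ "$decoded" = "$xml_content" ] && echo "round trip OK"

rm -f "$tmp_file"
```

Command substitution strips trailing newlines, so both xml_content and decoded end at the closing tag rather than with the file's final line break.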
Best Practices for Writing Multi-line XML/String Literals in Bash Scripts