Efficiently Parse INI Files into Bash Associative Arrays with AWK and Sed



INI files are a common configuration format in shell scripting, but naive parsing approaches often struggle with:

  • Section handling across multiple entries
  • Whitespace around equals signs
  • Special character escaping
  • Array variable creation

Here's an enhanced AWK solution that properly handles sections and whitespace:

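awk -F'=' '
/^[ \t]*\[.*\][ \t]*$/ {
    # Section header: strip whitespace, then the surrounding brackets
    gsub(/^[ \t]+|[ \t]+$/, "");
    section = substr($0, 2, length($0) - 2);
    next;
}
NF == 2 {
    # key = value: trim both sides before printing
    gsub(/^[ \t]+|[ \t]+$/, "", $1);
    gsub(/^[ \t]+|[ \t]+$/, "", $2);
    if ($1 != "") printf "%s[%s]=\"%s\"\n", $1, section, $2;
}' config.ini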
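
To make the output format concrete, the following sketch runs this kind of AWK program on a small sample file. The file contents and the names `sample_ini` and `generated` are assumptions made for illustration, not part of the original configuration.

```shell
#!/usr/bin/env bash
# Hypothetical sample file; section and key names are made up for illustration.
sample_ini="$(mktemp)"
cat > "$sample_ini" <<'EOF'
[foobar]
session = foo
path=/some/path

[baz]
session=bar
EOF

# Section headers set the current section; key=value lines are trimmed and printed.
generated="$(awk -F'=' '
/^[ \t]*\[.*\][ \t]*$/ {
    gsub(/^[ \t]+|[ \t]+$/, "")
    section = substr($0, 2, length($0) - 2)
    next
}
NF == 2 {
    gsub(/^[ \t]+|[ \t]+$/, "", $1)
    gsub(/^[ \t]+|[ \t]+$/, "", $2)
    if ($1 != "") printf "%s[%s]=\"%s\"\n", $1, section, $2
}' "$sample_ini")"

printf '%s\n' "$generated"
rm -f "$sample_ini"
```

Each printed line has the form name[section]="value", which is valid Bash assignment syntax once an associative array with that name exists.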

For those preferring sed, here's a solution using hold space:

sed -n -E '
/^\[(.+)\]$/ {
  s//\1/
  h
  d
}
/^[[:space:]]*([^[:space:]=]+)[[:space:]]*=[[:space:]]*(.*)$/ {
  s//\1=\2/
  G
  s/^([^=]+)=(.*)\n(.+)$/\1[\3]="\2"/p
}' config.ini

For production environments, consider these enhancements:

# Handle comments and empty lines
awk -F'=' '
BEGIN { section = "default"; }
/^[[:space:]]*;/ { next; }
/^[[:space:]]*#/ { next; }
/^[[:space:]]*$/ { next; }
/^\[.*\]/ {
    section = substr($0, 2, length($0)-2);
    next;
}
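index($0, "=") {
    key = $1; sub(/^[[:space:]]*/, "", key); sub(/[[:space:]]*$/, "", key);
    # Take everything after the first "=" so values may themselves contain "="
    val = substr($0, index($0, "=") + 1);
    sub(/^[[:space:]]*/, "", val); sub(/[[:space:]]*$/, "", val);
    printf "%s[%s]=\"%s\"\n", key, section, val;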
}' config.ini

To actually create the variables in your bash session:

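# Each array must already exist as associative (declare -A name);
# otherwise bash treats the subscript as an arithmetic index.
while IFS="=" read -r key value; do
    value="${value#\"}"; value="${value%\"}"   # strip the quotes awk added
    declare -g "$key=$value"
done < <(awk -F'=' '...' config.ini)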

For associative arrays in bash 4+:

declare -A session
declare -A path

eval "$(awk '...' config.ini)"

echo "${session[foobar]}"  # Outputs: foo
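
The eval approach can be sketched end to end as follows. The sample file contents are an assumption, and the inline AWK program is a simplified stand-in for the elided one. Because eval executes whatever AWK prints, this pattern is only reasonable for configuration files you trust.

```shell
#!/usr/bin/env bash
# Hypothetical sample file, used only for this sketch.
sample_ini="$(mktemp)"
printf '%s\n' '[foobar]' 'session=foo' 'path=/some/path' > "$sample_ini"

# The arrays must exist as associative arrays before the assignments run.
declare -A session path

# awk prints lines such as session[foobar]="foo"; eval executes them as assignments.
eval "$(awk -F'=' '
/^\[.*\]$/ { section = substr($0, 2, length($0) - 2); next }
NF == 2    { printf "%s[%s]=\"%s\"\n", $1, section, $2 }
' "$sample_ini")"

echo "${session[foobar]}"
echo "${path[foobar]}"
rm -f "$sample_ini"
```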

Transforming INI-style configuration into Bash associative arrays presents several technical hurdles. The main challenges include:

  • Proper section handling with nested brackets
  • Trimming whitespace around delimiters
  • Maintaining section context across file parsing
  • Generating valid Bash array syntax

Here's an enhanced AWK implementation that handles whitespace and produces cleaner output:

awk -F'=' '{
    # Trim the whole line first; awk re-splits the fields afterwards
    gsub(/^[[:space:]]+|[[:space:]]+$/, "", $0)
    if ($0 ~ /^\[.*\]$/) {
        section = substr($0, 2, length($0) - 2)
    }
    else if ($0 ~ /^[^#;]/ && $0 ~ /=/) {
        gsub(/^[[:space:]]+|[[:space:]]+$/, "", $1)
        gsub(/^[[:space:]]+|[[:space:]]+$/, "", $2)
        printf "%s[%s]=\"%s\"\n", $1, section, $2
    }
}' config.ini

For those preferring pure Bash without external tools:

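shopt -s extglob    # enables the *( ) patterns used for trimming below
declare -A config
current_section=""

while IFS= read -r line || [[ -n "$line" ]]; do
    # Trim leading and trailing whitespace from the whole line
    line="${line#"${line%%[![:space:]]*}"}"
    line="${line%"${line##*[![:space:]]}"}"

    if [[ "$line" =~ ^\[(.*)\]$ ]]; then
        current_section="${BASH_REMATCH[1]}"
        continue
    fi
    [[ "$line" =~ ^([^=]+)=(.*)$ ]] || continue

    # %% and ## remove the longest run of whitespace next to the "="
    key="${BASH_REMATCH[1]%%*([[:space:]])}"
    value="${BASH_REMATCH[2]##*([[:space:]])}"
    config["${current_section}_${key}"]="$value"
done < "config.ini"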

# Access values:
echo "Foobar session: ${config[foobar_session]}"
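
With the flat section_key naming, listing everything that belongs to one section means filtering on the key prefix. The sketch below assumes a `config` array already filled in that style; the helper name `section_entries` and the sample entries are hypothetical.

```shell
#!/usr/bin/env bash
# Sample data in the "section_key" layout; the values are illustrative only.
declare -A config=(
    [foobar_session]="foo"
    [foobar_path]="/some/path"
    [baz_session]="bar"
)

# Print "key=value" for every entry whose composite key starts with "<section>_".
section_entries() {
    local section="$1" composite
    for composite in "${!config[@]}"; do
        if [[ "$composite" == "${section}_"* ]]; then
            printf '%s=%s\n' "${composite#"${section}_"}" "${config[$composite]}"
        fi
    done | sort    # associative arrays have no defined order, so sort for stable output
}

section_entries foobar
```

One limitation of this layout: if section names may themselves contain underscores, a prefix match can pick up keys from a different section, so a separator that cannot appear in section names would be safer.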

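Production-grade parsing typically also has to deal with:

  • Comments (both ; and #)
  • Quoted values
  • Multi-line values
  • Section inheritance

The function below covers the first two, comments and quoted values. Multi-line values and section inheritance are not handled and would need extra logic: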

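shopt -s extglob    # enables the *( ) patterns used for trimming below

parse_ini() {
    local file="$1"
    local section=""
    local line key value

    while IFS= read -r line || [[ -n "$line" ]]; do
        # Trim leading and trailing whitespace from the whole line
        line="${line#"${line%%[![:space:]]*}"}"
        line="${line%"${line##*[![:space:]]}"}"

        # Skip comment lines starting with ; or #
        [[ "$line" =~ ^\;|^\# ]] && continue

        if [[ "$line" =~ ^\[([^]]+)\] ]]; then
            section="${BASH_REMATCH[1]}"
        elif [[ "$line" =~ ^([^=]+)=(.*) ]]; then
            key="${BASH_REMATCH[1]%%*([[:space:]])}"
            value="${BASH_REMATCH[2]##*([[:space:]])}"

            # Strip one pair of surrounding double quotes
            if [[ "$value" =~ ^\".*\"$ ]]; then
                value="${value:1:-1}"
            fi

            # Create the array on first use, then store the value without extra quotes
            declare -gA "$key"
            declare -g "${key}[${section}]=${value}"
        fi
    done < "$file"
}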

parse_ini "config.ini"
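
Multi-line values, one of the items listed above, depend heavily on the INI dialect. The following is a minimal sketch that assumes a trailing backslash marks a continuation line; that convention and the function name `join_continuations` are assumptions, not something the INI format itself defines.

```shell
#!/usr/bin/env bash
# Join physical lines that end in a backslash into one logical line.
# Assumption: trailing-backslash continuation; other dialects use indentation instead.
join_continuations() {
    local line pending=""
    while IFS= read -r line || [[ -n "$line" ]]; do
        if [[ "$line" == *\\ ]]; then
            pending+="${line%?}"       # drop the backslash and keep accumulating
        else
            printf '%s\n' "${pending}${line}"
            pending=""
        fi
    done
    # Emit a dangling continuation at end of input rather than losing it
    [[ -n "$pending" ]] && printf '%s\n' "$pending"
    return 0
}

joined="$(printf '%s\n' '[foobar]' 'path=/some/\' 'path' 'session=foo' | join_continuations)"
printf '%s\n' "$joined"
```

Run as a filter in front of any of the parsers, for example by feeding config.ini to `join_continuations` on standard input and parsing its output, so the parsing logic itself stays line-based.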

For large INI files, these optimizations help:

  1. Use process substitution instead of temporary files
  2. Minimize subshell creation
  3. Prefer built-in string operations over external commands
  4. Cache frequently accessed values in variables instead of re-parsing the file
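
The first three points can be sketched together. The input lines here are inline sample data rather than a real file, and the variable names are illustrative assumptions.

```shell
#!/usr/bin/env bash
declare -A config

# Process substitution: the loop runs in the current shell, so config
# is still populated after the loop (a pipeline would fill it in a subshell).
while IFS='=' read -r key value; do
    config["$key"]="$value"
done < <(printf '%s\n' 'session=foo' 'path=/some/path')

# Built-in parameter expansion instead of spawning an external basename process.
file_name="${config[path]##*/}"

echo "${#config[@]} entries, last path component: $file_name"
```

Had the loop been written as `printf ... | while read ...`, the array would be filled inside a subshell and appear empty afterwards under bash's default settings, which is the usual reason to prefer process substitution here.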