Convert IDN to Punycode in Bash: A Practical Guide for Developers



Internationalized Domain Names (IDNs) containing non-ASCII characters need conversion to Punycode for DNS resolution. In bash environments, we have several efficient ways to perform this conversion.

The most straightforward method is the idn2 tool from GNU Libidn2, which implements IDNA2008:


# Install the idn2 command-line tool if needed (Ubuntu/Debian)
sudo apt-get install idn2

# Convert IDN to Punycode
idn2 президент.рф
# Output: xn--d1abbgf6aiiy.xn--p1ai

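If idn2 isn't available, Python's built-in idna codec works well. Note that it implements the older IDNA2003 rules, so results can differ from idn2 for a few characters such as the German ß: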


python3 -c "import codecs; print(codecs.encode('президент.рф', 'idna').decode('ascii'))"
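
For use inside a larger script, the one-liner can be wrapped in a pair of small helpers. This is a minimal sketch using only the standard library; the function names to_ascii and to_unicode are illustrative and not part of any library, and the built-in codec follows IDNA2003 rules.

```python
def to_ascii(domain: str) -> str:
    """Encode a Unicode domain name to its ASCII (Punycode) form."""
    # The built-in 'idna' codec encodes each dot-separated label separately.
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain: str) -> str:
    """Decode an ASCII (Punycode) domain name back to Unicode."""
    return ascii_domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    encoded = to_ascii("президент.рф")
    print(encoded)              # xn--d1abbgf6aiiy.xn--p1ai
    print(to_unicode(encoded))  # президент.рф
```

Decoding the result back to Unicode is a cheap sanity check that the conversion round-trips.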

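If you prefer Perl, the Net::IDN::Encode module from CPAN does the same job (it is not part of core Perl, so it has to be installed first):


echo "президент.рф" | perl -CS -MNet::IDN::Encode=domain_to_ascii -lnE 'say domain_to_ascii($_)'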

Process a list of domains from a file:


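while IFS= read -r domain; do
  echo "$domain → $(idn2 "$domain")"
done < domains.txt

A few practical tips:

  • Ensure the input is valid UTF-8 and that your locale is a UTF-8 locale. To check a file: iconv -f UTF-8 -t UTF-8 domains.txt > /dev/null (a non-zero exit status means invalid input)
  • Lowercase mixed-case domains before converting: IDNA2008 does not permit uppercase characters, so a tool that skips case mapping will reject them
  • Rerun a failing domain with idn2 --debug to see why the conversion was rejected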
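
The first two tips can be sketched as a small pre-processing step. This is an illustrative helper (the name prepare_domain is an assumption, not an existing API) that rejects bytes that are not valid UTF-8 and folds the rest to lowercase before any conversion takes place.

```python
from typing import Optional


def prepare_domain(raw: bytes) -> Optional[str]:
    """Return a cleaned, lowercased domain, or None if the bytes are not valid UTF-8."""
    try:
        text = raw.decode("utf-8")  # rejects input that is not valid UTF-8
    except UnicodeDecodeError:
        return None
    # Drop surrounding whitespace (including the trailing newline) and fold case.
    return text.strip().lower()


if __name__ == "__main__":
    print(prepare_domain("ПРЕЗИДЕНТ.РФ\n".encode("utf-8")))  # президент.рф
    print(prepare_domain(b"\xff\xfe.com"))                    # None
```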

For bulk processing (>1000 domains), a single Python process is generally faster than the shell loop, because the loop starts a new idn2 process for every line:


time python3 -c "import sys,codecs; [print(codecs.encode(line.strip(),'idna').decode('ascii')) for line in sys.stdin]" < domains.txt
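
One drawback of the one-liner is that a single invalid line aborts the whole run. The sketch below, assuming the same built-in idna codec, collects failures separately instead; convert_all is a hypothetical helper name chosen for this example.

```python
import io
from typing import Iterable, List, Tuple


def convert_all(lines: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Convert each non-empty line; return (converted pairs, domains that failed)."""
    converted, failed = [], []
    for line in lines:
        domain = line.strip()
        if not domain:
            continue  # skip blank lines
        try:
            converted.append((domain, domain.encode("idna").decode("ascii")))
        except UnicodeError:
            failed.append(domain)  # e.g. an empty or over-long label
    return converted, failed


if __name__ == "__main__":
    sample = io.StringIO("президент.рф\n\nexample.com\nbad..name\n")
    ok, bad = convert_all(sample)
    for original, ascii_form in ok:
        print(f"{original} -> {ascii_form}")
    print("failed:", bad)
```

Passing an open file object (or sys.stdin) instead of the StringIO sample processes a real list the same way.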

As a concrete example of the conversion, the Russian domain "президент.рф" becomes "xn--d1abbgf6aiiy.xn--p1ai", the ASCII-compatible form that is actually sent to DNS.
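
The conversion is applied one label at a time: each dot-separated label that contains non-ASCII characters is Punycode-encoded and given the xn-- prefix, while pure-ASCII labels pass through unchanged. The sketch below illustrates this with Python's standard punycode codec. It deliberately skips the normalization step that a real IDNA implementation performs, so treat it as an explanation of the format rather than a replacement for idn2; the function names are illustrative.

```python
def encode_label(label: str) -> str:
    """Encode one label: ASCII labels are kept, others get 'xn--' + Punycode."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def encode_domain(domain: str) -> str:
    """Encode a whole domain by encoding each dot-separated label."""
    return ".".join(encode_label(label) for label in domain.split("."))


if __name__ == "__main__":
    print(encode_label("президент"))      # xn--d1abbgf6aiiy
    print(encode_label("рф"))             # xn--p1ai
    print(encode_domain("президент.рф"))  # xn--d1abbgf6aiiy.xn--p1ai
```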

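Many Linux distributions also package the older idn command from GNU Libidn, which implements IDNA2003. Use --idna-to-ascii for domain names; --punycode-encode applies raw Punycode to the whole string and does not produce a valid domain name:

# Basic conversion
idn --quiet --idna-to-ascii "президент.рф"

# Reading domains from standard input
echo "президент.рф" | idn --quiet --idna-to-ascii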

If idn isn't available, you can use Python's built-in support:

python3 -c "import sys; print(sys.argv[1].encode('idna').decode('ascii'))" "президент.рф"

Or using Perl's Net::IDN::Encode module (from CPAN):

perl -CA -MNet::IDN::Encode=domain_to_ascii -e 'print domain_to_ascii($ARGV[0]), "\n"' "президент.рф"

For processing multiple domains in a file (domains.txt):

while IFS= read -r domain; do
    idn --quiet --idna-to-ascii "$domain"
done < domains.txt > punycoded.txt

Add error checking to your script:

if ! command -v idn &> /dev/null; then
    echo "Error: idn command not found. Install GNU Libidn package." >&2
    exit 1
fi

domain="президент.рф"

# Check both the exit status and that the output is non-empty
if punycode=$(idn --quiet --idna-to-ascii "$domain" 2>/dev/null) && [ -n "$punycode" ]; then
    echo "$punycode"
else
    echo "Conversion failed for $domain" >&2
fi

For large-scale conversions, Python or Perl implementations are generally faster than shell loops, since a loop starts a new process for every domain. To compare the two on your own data, time the shell loop over a list of 10,000 domains:

time while read -r d; do idn --quiet "$d"; done < large_list.txt > output.txt

versus a Python one-liner:

time python3 -c "import sys; [print(line.strip().encode('idna').decode('ascii')) for line in sys.stdin]" < large_list.txt > output.txt
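
To isolate the conversion cost from process start-up and file I/O, the in-process work can also be timed directly. This is a rough sketch, assuming a synthetic workload of 10,000 copies of one domain; the helper names are illustrative, and actual timings depend on the machine and the real list.

```python
import time


def convert(domain: str) -> str:
    """Convert one domain with the built-in idna codec."""
    return domain.encode("idna").decode("ascii")


def time_conversion(domains: list) -> float:
    """Return the seconds taken to convert every domain in the list in-process."""
    start = time.perf_counter()
    for domain in domains:
        convert(domain)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Synthetic workload: 10,000 copies of one domain (an assumption for illustration).
    sample = ["президент.рф"] * 10_000
    print(f"{time_conversion(sample):.3f} s for {len(sample)} domains")
```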