How to Convert UTF-8 NFD to NFC Filenames in rsync/afpd for Cross-Platform File Sharing


When transferring files between macOS and Linux systems, you'll encounter filename encoding discrepancies due to different Unicode normalization forms. macOS uses NFD (Normalization Form D) while most Linux systems use NFC (Normalization Form C). For example, the character "ÿ" might be stored as:

// NFD (macOS): 'y' + '◌̈' (U+0079 + U+0308)
// NFC (Linux): 'ÿ' (U+00FF)
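
To see the two forms side by side, here is a minimal Python sketch using the standard unicodedata module. The variable names are illustrative and not part of any tool mentioned here.

```python
import unicodedata

composed = "\u00ff"                                  # 'ÿ' as one code point (NFC)
decomposed = unicodedata.normalize("NFD", composed)  # 'y' + combining diaeresis

# List the code points of the decomposed form.
print([f"U+{ord(c):04X}" for c in decomposed])       # ['U+0079', 'U+0308']

# The two spellings look the same on screen but are different strings
# and different byte sequences on disk.
print(composed == decomposed)                        # False
print(composed.encode("utf-8"))                      # b'\xc3\xbf'
print(decomposed.encode("utf-8"))                    # b'y\xcc\x88'

# Normalizing back to NFC recovers the single code point.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Because a Linux filesystem compares names byte for byte, these two spellings are treated as two unrelated filenames.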

First verify if your files are affected. On Linux, run:

find . -exec bash -c '
    for f; do
        # perl exits 0 when the stored path differs from its NFC form
        if perl -CA -MUnicode::Normalize -e "exit(NFC(\$ARGV[0]) eq \$ARGV[0] ? 1 : 0)" "$f"; then
            printf "%s\n" "$f"
        fi
    done
' bash {} +
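
The same check can be sketched in Python for systems where Perl's Unicode::Normalize is not convenient. The function names is_nfc and find_non_nfc are made up for this example, and it assumes the filesystem returns names exactly as they were stored, as Linux filesystems normally do.

```python
import os
import unicodedata

def is_nfc(name):
    """Return True if the name is already in NFC form."""
    return unicodedata.normalize("NFC", name) == name

def find_non_nfc(root):
    """Walk root and collect paths whose final component is not NFC."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_nfc(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)

# Example: for path in find_non_nfc("/path/to/files"): print(path)
```

An empty result means nothing under the directory needs renaming.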

For new transfers, add this to your rsync command:

rsync -av --iconv=utf-8-mac,utf-8 source/ user@server:destination/

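The --iconv option converts filenames from NFD to NFC during the transfer. Its argument is given in LOCAL,REMOTE order, so the command above is meant to be run on the Mac. It requires rsync ≥3.0.0 on both ends.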
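
A quick way to check the version requirement from a script is to parse the banner printed by rsync --version. The sketch below assumes the banner contains the text "version X.Y.Z", and the helper names are hypothetical. It does not detect builds that were compiled without iconv support.

```python
import re
import subprocess

def parse_rsync_version(banner):
    """Extract (major, minor, patch) from `rsync --version` output, or None."""
    match = re.search(r"version\s+(\d+)\.(\d+)\.(\d+)", banner)
    return tuple(int(part) for part in match.groups()) if match else None

def supports_iconv(banner):
    """Return True if the reported version is at least 3.0.0."""
    version = parse_rsync_version(banner)
    return version is not None and version >= (3, 0, 0)

def local_rsync_supports_iconv(binary="rsync"):
    """Run the local rsync binary and check the version it reports."""
    result = subprocess.run([binary, "--version"], capture_output=True, text=True)
    return supports_iconv(result.stdout)
```

Run the check on both machines, since the remote rsync also has to understand the option.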

For existing files, convmv is the most reliable tool:

# Dry run first: without --notest, convmv only prints what it would rename
convmv -f utf8 -t utf8 --nfc -r /path/to/files

# Actual conversion: --notest applies the renames
convmv -f utf8 -t utf8 --nfc -r --notest /path/to/files

If you're using netatalk (afpd), modify your configuration:

# In /etc/netatalk/afpd.conf
- -noddp -uamlist uams_dhx.so,uams_dhx2.so
+ -noddp -uamlist uams_dhx.so,uams_dhx2.so -mapposix:nomacutf8

For systems without convmv, use this Python 3 script:

#!/usr/bin/env python3
import os
import sys
from unicodedata import normalize

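def convert_filenames(path):
    # Walk bottom-up so a directory is renamed only after its contents.
    for root, dirs, files in os.walk(path, topdown=False):
        for name in files + dirs:
            old_path = os.path.join(root, name)
            new_name = normalize('NFC', name)
            if new_name == name:
                continue
            new_path = os.path.join(root, new_name)
            # Skip rather than overwrite if the NFC name is already taken.
            if os.path.exists(new_path):
                print(f"Skipped (target exists): {old_path}")
                continue
            os.rename(old_path, new_path)
            print(f"Renamed: {old_path} → {new_path}")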

if __name__ == "__main__":
    convert_filenames(sys.argv[1] if len(sys.argv) > 1 else '.')
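
Before renaming anything, it can help to preview the result and spot cases where an NFD name and its NFC twin already sit in the same directory. The following is a small sketch of such a planner. plan_renames is a hypothetical helper that works on a plain list of names and never touches the disk.

```python
import unicodedata

def plan_renames(names):
    """Given the names in one directory, return (renames, collisions).

    renames:    (old, new) pairs that are safe to apply.
    collisions: (old, new) pairs where `new` is already taken.
    """
    taken = set(names)
    renames, collisions = [], []
    for old in sorted(names):
        new = unicodedata.normalize("NFC", old)
        if new == old:
            continue                      # already NFC, nothing to do
        if new in taken:
            collisions.append((old, new)) # renaming would clobber an entry
        else:
            renames.append((old, new))
            taken.add(new)                # reserve the target name
    return renames, collisions

renames, collisions = plan_renames(["Queensry\u0308che", "notes.txt"])
print(len(renames), len(collisions))      # 1 0
```

Collisions need a manual decision (merge, delete, or rename one side), which is why the sketch reports them instead of resolving them.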

To avoid future issues:

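  • Do not rely on a terminal setting on the Mac: the stored form is decided by the filesystem layer, so convert during transfer (--iconv) or on the Linux side
  • For shared folders, consider SMB (Samba with the vfs_fruit module) instead of AFP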

When transferring files between macOS and Linux systems, filename encoding issues often arise due to differing Unicode normalization forms. macOS uses NFD (Normalization Form D) decomposition while most Linux systems use NFC (Normalization Form C) composition. This causes files with special characters (like "ü", "é", or "ø") to appear corrupted or inaccessible.

Typical symptoms after using rsync from macOS to Linux without conversion:

  • Files appear in directory listings but cannot be opened by name
  • AFP shares report "file not found" for valid files
  • Applications like iTunes fail to load media files with special characters
  • Example problematic filename: "Queensrÿche" becomes "Queensry\u0308che"
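
The "file not found" symptom comes down to an exact string comparison between two spellings of the same name. The sketch below illustrates this with an in-memory listing. find_entry and the sample names are hypothetical and only serve the illustration.

```python
import unicodedata

def nfc(text):
    """Normalize text to NFC."""
    return unicodedata.normalize("NFC", text)

def find_entry(listing, wanted):
    """Return the stored name that matches `wanted` after normalization, or None."""
    target = nfc(wanted)
    for stored in listing:
        if nfc(stored) == target:
            return stored
    return None

# A listing as left behind by a transfer without conversion: the name is NFD.
listing = ["Queensry\u0308che", "notes.txt"]

# An exact lookup with the NFC spelling fails.
print("Queensr\u00ffche" in listing)                                    # False

# Comparing normalized forms finds the entry.
print(find_entry(listing, "Queensr\u00ffche") == "Queensry\u0308che")   # True
```

Applications that type or store the NFC spelling hit the first case, which is why the file is listed but cannot be opened.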

For new transfers, add these rsync flags:

rsync -av --iconv=utf-8-mac,utf-8 source/ destination/

This converts filenames on the fly during the transfer. utf-8-mac is the charset name under which the macOS iconv exposes Apple's decomposed (NFD-style) UTF-8.
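
When the transfer is driven from a script, the charset pair has to follow rsync's LOCAL,REMOTE order. The sketch below builds the argument list for either direction. build_rsync_command and run_rsync are hypothetical helpers, and the Linux-initiated variant is an assumption derived from that ordering rather than something tested here.

```python
import subprocess

def build_rsync_command(source, destination, running_on_mac):
    """Build an rsync argv with --iconv given in LOCAL,REMOTE order."""
    if running_on_mac:
        iconv_spec = "utf-8-mac,utf-8"   # local side is the Mac
    else:
        iconv_spec = "utf-8,utf-8-mac"   # assumption: remote side is the Mac
    return ["rsync", "-av", f"--iconv={iconv_spec}", source, destination]

def run_rsync(source, destination, running_on_mac):
    """Run the transfer and raise if rsync reports an error."""
    subprocess.run(build_rsync_command(source, destination, running_on_mac),
                   check=True)

print(build_rsync_command("source/", "destination/", True))
# ['rsync', '-av', '--iconv=utf-8-mac,utf-8', 'source/', 'destination/']
```

Passing the command as a list avoids shell quoting problems with accented names in the paths.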

For existing files, install convmv on Linux:

sudo apt-get install convmv  # Debian/Ubuntu
sudo yum install convmv      # RHEL/CentOS

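Run the conversion. convmv only previews changes by default, so run the command without --notest first to check the renames, then run it as shown: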

convmv -r -f utf-8 -t utf-8 --nfc --notest /path/to/files

Flags explanation:

  • -r: recursive
  • --nfc: convert to NFC
  • --notest: actually perform (remove for dry run)
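
The preview-then-apply routine can be wrapped in a short script so the two invocations never drift apart. This is only a sketch: convmv_command, preview and apply_changes are hypothetical names, and the flags are exactly the ones listed above.

```python
import subprocess

def convmv_command(path, apply=False):
    """Build the convmv argv; --notest is added only when applying."""
    cmd = ["convmv", "-r", "-f", "utf-8", "-t", "utf-8", "--nfc"]
    if apply:
        cmd.append("--notest")
    cmd.append(path)
    return cmd

def preview(path):
    """Dry run: print what convmv reports it would rename."""
    result = subprocess.run(convmv_command(path), capture_output=True, text=True)
    print(result.stdout + result.stderr)

def apply_changes(path):
    """Perform the renames and raise if convmv exits with an error."""
    subprocess.run(convmv_command(path, apply=True), check=True)

print(convmv_command("/path/to/files"))
# ['convmv', '-r', '-f', 'utf-8', '-t', 'utf-8', '--nfc', '/path/to/files']
```

Keeping preview and apply_changes as separate calls leaves room to read the dry-run output before anything is renamed.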

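For netatalk (afpd), edit afpd.conf (its location varies by distribution, for example /etc/afpd.conf or /etc/netatalk/afpd.conf):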

-myname:"My Server" -nouservol -uamlist uams_guest.so,uams_dhx.so -savepassword -maccharset UTF-8-MAC

Key parameter:

  • -maccharset UTF-8-MAC: tells AFP to expect NFD from macOS clients

Create a Python conversion script (nfd2nfc.py):

#!/usr/bin/env python3
import os
import sys
import unicodedata

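def convert_path(path):
    # Normalize only the final path component to NFC.
    dirname, filename = os.path.split(path)
    return os.path.join(dirname, unicodedata.normalize('NFC', filename))

# Walk bottom-up so directories are renamed after their contents.
for root, dirs, files in os.walk(sys.argv[1], topdown=False):
    for name in files + dirs:
        old = os.path.join(root, name)
        new = convert_path(old)
        # Rename only when needed, and never overwrite an existing entry.
        if old != new and not os.path.exists(new):
            os.rename(old, new)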

Run with:

python3 nfd2nfc.py /mnt/media/Music

For ongoing syncs:

  1. Always use --iconv with rsync
  2. Consider SMB instead of AFP if possible (better Unicode handling)
  3. Re-run convmv (or the script above) on the Linux side after any transfer made without --iconv, rather than relying on a macOS-side setting