When transferring files between macOS and Linux systems, you'll encounter filename encoding discrepancies due to different Unicode normalization forms. macOS uses NFD (Normalization Form D) while most Linux systems use NFC (Normalization Form C). For example, the character "ÿ" might be stored as:
// NFD (macOS): 'y' + '◌̈' (U+0079 + U+0308)
// NFC (Linux): 'ÿ' (U+00FF)
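The difference can be reproduced with Python's standard unicodedata module. This is only a minimal sketch, and the variable names are illustrative:

```python
import unicodedata

composed = "\u00ff"                                   # 'ÿ' as a single code point (NFC)
decomposed = unicodedata.normalize("NFD", composed)   # 'y' followed by a combining diaeresis

# The two strings render identically but are different code point sequences
print([f"U+{ord(c):04X}" for c in composed])     # ['U+00FF']
print([f"U+{ord(c):04X}" for c in decomposed])   # ['U+0079', 'U+0308']
print(composed == decomposed)                    # False

# Their UTF-8 encodings, which is what ends up in the filename, differ too
print(composed.encode("utf-8"))     # b'\xc3\xbf'
print(decomposed.encode("utf-8"))   # b'y\xcc\x88'
```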
First verify if your files are affected. On Linux, run:
# Print every path that is not already in NFC
find . -print0 | perl -0 -CSD -MUnicode::Normalize -ne 'chomp; print "$_\n" if $_ ne NFC($_)'
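If Perl is not at hand, roughly the same check can be sketched in Python. The function names is_nfc and find_non_nfc are hypothetical, and the sketch assumes filenames decode cleanly as UTF-8:

```python
import os
import unicodedata

def is_nfc(name):
    """Return True if the name is already in NFC."""
    return unicodedata.normalize("NFC", name) == name

def find_non_nfc(root):
    """Collect paths under root whose final component is not in NFC."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_nfc(name):
                hits.append(os.path.join(dirpath, name))
    return hits
```

Calling find_non_nfc(".") and printing the result lists the affected paths; an empty list suggests nothing needs converting.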
For new transfers, add this to your rsync command:
rsync -av --iconv=utf-8-mac,utf-8 source/ user@server:destination/
The --iconv option handles the NFD→NFC conversion during transfer. Note that this requires rsync ≥3.0.0.
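rsync reads the --iconv value in LOCAL,REMOTE order, so the right value depends on which machine runs the command. A small sketch of a wrapper that picks the order; rsync_command and its running_on_mac parameter are hypothetical, and it assumes exactly one side is a Mac:

```python
import shlex

def rsync_command(source, destination, running_on_mac):
    """Build an rsync argument list with --iconv in LOCAL,REMOTE order."""
    # Assumption: one side is macOS (NFD, "utf-8-mac") and the other is Linux (NFC, "utf-8")
    charsets = "utf-8-mac,utf-8" if running_on_mac else "utf-8,utf-8-mac"
    return ["rsync", "-av", f"--iconv={charsets}", source, destination]

cmd = rsync_command("source/", "user@server:destination/", running_on_mac=True)
print(shlex.join(cmd))
# rsync -av --iconv=utf-8-mac,utf-8 source/ user@server:destination/
```

The list form can be handed to subprocess.run(cmd, check=True) without going through a shell.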
For existing files, convmv is the most reliable tool:
# Dry run first (convmv only previews the changes unless --notest is given)
convmv -f utf8 -t utf8 --nfc -r /path/to/files
# Actual conversion
convmv -f utf8 -t utf8 --nfc -r --notest /path/to/files
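The preview-then-apply workflow can be mirrored in a script when convmv's behaviour needs to be customised. A sketch under stated assumptions: plan_renames and apply_renames are hypothetical names, and the target filesystem is assumed to be normalization-sensitive (for example ext4), so an NFC and an NFD name count as different entries:

```python
import os
import unicodedata

def plan_renames(root):
    """Return (old_path, new_path) pairs for names that are not in NFC.

    Nothing on disk is touched; this is the dry run.
    """
    plan = []
    # Bottom-up order lists children before their parent directory
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            nfc = unicodedata.normalize("NFC", name)
            if nfc != name:
                plan.append((os.path.join(dirpath, name), os.path.join(dirpath, nfc)))
    return plan

def apply_renames(plan):
    """Execute a plan in order, skipping any entry whose target already exists."""
    done = []
    for old, new in plan:
        if os.path.exists(new):
            continue  # leave collisions for manual review
        os.rename(old, new)
        done.append((old, new))
    return done
```

Printing the result of plan_renames before passing it to apply_renames gives the same safety net as running convmv without --notest first.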
If you're using netatalk (afpd), modify your configuration:
# In /etc/netatalk/afpd.conf
- -noddp -uamlist uams_dhx.so,uams_dhx2.so
+ -noddp -uamlist uams_dhx.so,uams_dhx2.so -mapposix:nomacutf8
For systems without convmv, use this Python 3 script:
#!/usr/bin/env python3
import os
import sys
from unicodedata import normalize

def convert_filenames(path):
    # Walk bottom-up so a directory is renamed only after its contents
    for root, dirs, files in os.walk(path, topdown=False):
        for name in files + dirs:
            new_name = normalize('NFC', name)
            if new_name == name:
                continue
            old_path = os.path.join(root, name)
            new_path = os.path.join(root, new_name)
            if os.path.exists(new_path):
                # Do not overwrite an existing NFC-named entry
                print(f"Skipped (target exists): {old_path}")
                continue
            os.rename(old_path, new_path)
            print(f"Renamed: {old_path} → {new_path}")

if __name__ == "__main__":
    convert_filenames(sys.argv[1] if len(sys.argv) > 1 else '.')
To avoid future issues:
- Set your macOS terminal to always use NFC:
export NFD2NFC=1
- For shared folders, consider using SMB with the vfs_fruit module instead of AFP
When transferring files between macOS and Linux systems, filename encoding issues often arise due to differing Unicode normalization forms. macOS uses NFD (Normalization Form D, decomposed) while most Linux systems use NFC (Normalization Form C, composed). As a result, files whose names contain accented characters (like "ü", "é", or "ñ") can appear corrupted or inaccessible.
After using rsync from macOS to Linux:
- Files appear in directory listings but vanish when accessed
- AFP shares report "file not found" for valid files
- Applications like iTunes fail to load media files with special characters
- Example problematic filename: "Queensrÿche" becomes "Queensry\u0308che"
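These symptoms come down to an exact comparison: the name typed or stored in NFC does not match the NFD name on disk, even though both look the same. A sketch of the mismatch and of a normalization-insensitive lookup; find_entry is a hypothetical helper, not part of any tool mentioned here:

```python
import os
import unicodedata

def find_entry(directory, wanted):
    """Return the on-disk name matching `wanted` in any normalization form, or None."""
    target = unicodedata.normalize("NFC", wanted)
    for name in os.listdir(directory):
        if unicodedata.normalize("NFC", name) == target:
            return name
    return None

typed = "Queensr\u00ffche"        # NFC, as typed on Linux
on_disk = "Queensry\u0308che"     # NFD, as copied from macOS

print(typed == on_disk)                                  # False: the exact match fails
print(unicodedata.normalize("NFC", on_disk) == typed)    # True: same name once normalized
```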
For new transfers, add these rsync flags:
rsync -av --iconv=utf-8-mac,utf-8 source/ destination/
This converts filenames on the fly during the transfer. utf-8-mac is Apple's NFD variant of UTF-8.
For existing files, install convmv on Linux:
sudo apt-get install convmv # Debian/Ubuntu
sudo yum install convmv # RHEL/CentOS
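Run the conversion (dry run first!):
convmv -r -f utf-8 -t utf-8 --nfc /path/to/files # dry run, nothing is renamed
convmv -r -f utf-8 -t utf-8 --nfc --notest /path/to/files # actually rename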
Flags explanation:
- -r: recursive
- --nfc: convert to NFC
- --notest: actually perform the renames (omit for a dry run)
Edit /etc/afpd.conf:
-myname:"My Server" -nouservol -uamlist uams_guest.so,uams_dhx.so -savepassword -maccharset UTF-8-MAC
Key parameter:
- -maccharset UTF-8-MAC: tells AFP to expect NFD from macOS clients
Create a Python conversion script (nfd2nfc.py):
#!/usr/bin/env python3
import os
import sys
import unicodedata

def convert_path(path):
    # Normalize only the last component; parent directories get their own turn
    dirname, filename = os.path.split(path)
    return os.path.join(dirname, unicodedata.normalize('NFC', filename))

# Walk bottom-up so directories are renamed after their contents
for root, dirs, files in os.walk(sys.argv[1], topdown=False):
    for name in files + dirs:
        old = os.path.join(root, name)
        new = convert_path(old)
        if old != new:
            os.rename(old, new)
Run with:
python3 nfd2nfc.py /mnt/media/Music
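A script like this renames unconditionally, and on Linux os.rename silently replaces an existing file. If an earlier sync left both the NFC and the NFD form of a name side by side, it may be worth checking for such pairs first. A sketch, with find_collisions as a hypothetical helper that inspects a single directory:

```python
import os
import unicodedata
from collections import defaultdict

def find_collisions(directory):
    """Map each NFC name to the on-disk names that normalize to it, keeping only clashes."""
    groups = defaultdict(list)
    for name in os.listdir(directory):
        groups[unicodedata.normalize("NFC", name)].append(name)
    return {nfc: sorted(names) for nfc, names in groups.items() if len(names) > 1}
```

An empty result means no two entries in that directory would end up with the same name after conversion. Running it per directory inside an os.walk loop extends the check to a whole tree.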
For ongoing syncs:
- Always use --iconv with rsync
- Consider SMB instead of AFP if possible (better Unicode handling)
- Set client-side macOS normalization:
defaults write .GlobalPreferences AppleDecomposedPrecomposedUnicode -bool false