When working with tar archives from different systems, you'll often encounter mismatches between the symbolic user/group names in the archive and your local /etc/passwd. Modern tar formats (ustar and later) store both textual names and numeric UID/GID values, but most command-line tools show only the symbolic names by default.
The simplest way to view numeric IDs:
tar tvf example.tar --numeric-owner
This forces tar to display numeric UIDs/GIDs regardless of whether matching users exist locally. Example output:
-rw-r--r-- 1000/1000 12345 2023-05-01 12:34 example.txt
For deeper inspection of the exact binary values stored in the tar file:
od -Ad -tx1 -N 512 example.tar | head
This shows the raw 512-byte header of the first entry. The UID occupies bytes 108-115 and the GID bytes 116-123; both are stored as ASCII octal digits, not as binary integers, so 1000 appears as the text 0001750.
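The offsets above can be turned into a small decoder. This is only a sketch: tar_first_ids is a hypothetical helper name, and it assumes the plain octal-text encoding (GNU tar switches to a binary base-256 encoding for IDs too large for the field, which this does not handle).

```shell
# Sketch: print the UID/GID of the first entry, read straight from the header.
# tar_first_ids is a hypothetical helper name, not a tar feature.
tar_first_ids() {
    local archive=$1 uid_oct gid_oct
    # Bytes 108-115 hold the UID and 116-123 the GID, as octal text
    # padded with NUL or space characters; strip the padding.
    uid_oct=$(dd if="$archive" bs=1 skip=108 count=8 2>/dev/null | tr -d '\0 ')
    gid_oct=$(dd if="$archive" bs=1 skip=116 count=8 2>/dev/null | tr -d '\0 ')
    # printf treats a number with a leading 0 as octal and prints it in decimal.
    printf '%d/%d\n' "0$uid_oct" "0$gid_oct"
}
```

Calling tar_first_ids example.tar prints a single UID/GID pair in decimal, which should agree with the first line of the --numeric-owner listing.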
Here's a bash script that counts how often each numeric UID/GID pair appears in an archive:
#!/bin/bash
# Field 2 of the verbose listing is the owner/group pair.
tar --numeric-owner -tvf "$1" | awk '{print $2}' | sort | uniq -c | sort -rn
Sample output showing UID/GID frequency:
42 1000/1000
15 0/0
3 33/33
To check whether the numeric ownership matches your expectations:
tar --numeric-owner -tvf backup.tar | grep -v ' 0/0 ' | grep -v ' 1000/1000 '
This filters out common expected values (root and primary user) to spot anomalies.
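The grep filter above matches the pattern anywhere on the line, so a file name containing " 0/0 " would also be hidden. A sketch of a stricter variant follows; filter_unexpected_owners is a hypothetical helper name, and it assumes the owner/group pair is the second whitespace-separated field, as in the GNU tar listings shown here.

```shell
# Sketch: print only listing lines whose owner/group pair is not in the
# allow-list given as arguments. Reads `tar --numeric-owner -tv` output
# on stdin. filter_unexpected_owners is a hypothetical helper name.
filter_unexpected_owners() {
    awk -v allowed="$*" '
        BEGIN {
            # Build a lookup table from the space-separated allow-list.
            n = split(allowed, pairs, " ")
            for (i = 1; i <= n; i++) ok[pairs[i]] = 1
        }
        # Keep the line only if field 2 (UID/GID) is not an allowed pair.
        !($2 in ok)
    '
}
```

It can be used as tar --numeric-owner -tvf backup.tar | filter_unexpected_owners 0/0 1000/1000, with as many expected pairs as needed.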
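When an archive uses PAX extended headers (common in modern tar formats), UID/GID values that do not fit in the ustar header fields are stored in extended header records. GNU tar applies these records automatically when listing, so no extra option is needed:
tar --numeric-owner -tvf example.tar
The IDs shown are then the values from the extended header, which take precedence over those in the ustar header.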
When working with tar archives containing system backups, you might encounter situations where file ownership information appears incorrect or doesn't match your system's /etc/passwd entries. The standard tar tvf command shows symbolic owner names, but these might not reflect the actual numeric UIDs/GIDs stored in the archive.
To see the actual numeric values stored in the tar file, use the --numeric-owner option:
tar tvf example.tar --numeric-owner
This will display output like:
-rw-r--r-- 1000/1000 12345 2023-01-01 12:34 path/to/file.txt
Here 1000/1000 is the UID/GID pair.
A second -v can be added for extra verbosity:
tar tvvf example.tar --numeric-owner
Depending on the tar version, this may print additional detail for some entries. The numeric IDs themselves come from --numeric-owner, not from the extra -v.
Listing never extracts anything, so it is safe on any archive. To look at only the first 20 entries:
tar tvf example.tar --numeric-owner | head -n 20
Or for a specific file:
tar tvf example.tar --numeric-owner path/to/specific/file
To summarize ownership across the whole archive, this bash script counts each UID/GID pair:
#!/bin/bash
# Field 2 of the listing is the UID/GID pair (field 3 is the file size).
tar tvf "$1" --numeric-owner | awk '{print $2}' | sort | uniq -c | sort -n
Save as analyze_tar_ids.sh and run:
./analyze_tar_ids.sh example.tar
To cross-reference with your system's user database:
tar tvf example.tar --numeric-owner | awk '{print $2}' | cut -d'/' -f1 | sort -u | while read -r uid; do
    grep "^[^:]*:[^:]*:$uid:" /etc/passwd || echo "UID $uid not found in /etc/passwd"
done
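The same cross-reference can be wrapped in a function that compares the UID field exactly and accepts an alternative passwd-format file, which is handy when checking against a copy of the source system's user database. This is a sketch; check_uids is a hypothetical helper name, and it assumes the GNU tar listing layout with the UID/GID pair in field 2.

```shell
# Sketch: map each UID found in a `tar --numeric-owner -tv` listing (stdin)
# to a user name from a passwd-format file. check_uids is a hypothetical
# helper name; the file defaults to /etc/passwd.
check_uids() {
    local passwd_file=${1:-/etc/passwd} uid name
    # Take the part before the slash in field 2, then de-duplicate numerically.
    awk '{ split($2, id, "/"); print id[1] }' | sort -un |
    while read -r uid; do
        # Field 3 of a passwd entry is the numeric UID.
        name=$(awk -F: -v uid="$uid" '$3 == uid { print $1; exit }' "$passwd_file")
        if [ -n "$name" ]; then
            printf 'UID %s -> %s\n' "$uid" "$name"
        else
            printf 'UID %s not found in %s\n' "$uid" "$passwd_file"
        fi
    done
}
```

A typical call would be tar tvf example.tar --numeric-owner | check_uids. On systems where users come from LDAP or another NSS source, getent passwd is the closer equivalent of reading /etc/passwd directly.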
With GNU tar, the option can also be placed before the archive name, which is the more portable ordering:
tar --numeric-owner -tvf example.tar
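BSD tar (bsdtar) also documents --numeric-owner. Whether the shortened spelling below is accepted depends on the implementation, so check your system's man page: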
tar -tvf example.tar --numeric
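For raw inspection of the tar header (shows the stored UID/GID fields):
od -A d -j 108 -N 16 -c example.tar
This prints the first entry's UID (bytes 108-115) and GID (bytes 116-123) as the octal digit characters stored in the header; 0001750, for example, is 1000 in decimal.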
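To read the stored IDs of every entry without relying on tar's own listing, the headers can be walked block by block. The following is a rough sketch under several assumptions: read_text, read_octal and list_header_ids are hypothetical helper names, all numeric fields use plain octal text (no base-256 encoding), and PAX or GNU long-name records are printed as ordinary entries rather than interpreted.

```shell
# Read COUNT bytes at byte OFFSET of FILE, dropping NUL padding.
read_text() { dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null | tr -d '\0'; }

# Same, but interpret the field as octal text and print it in decimal.
read_octal() { printf '%d' "0$(read_text "$1" "$2" "$3" | tr -d ' ')"; }

# Sketch: print "UID/GID name" for each header in an uncompressed archive.
list_header_ids() {
    local archive=$1 block=0 base name uid gid size
    while :; do
        base=$((block * 512))
        name=$(read_text "$archive" "$base" 100)
        # An empty name means a zero block (end of archive) or end of file.
        [ -n "$name" ] || break
        uid=$(read_octal "$archive" $((base + 108)) 8)
        gid=$(read_octal "$archive" $((base + 116)) 8)
        size=$(read_octal "$archive" $((base + 124)) 12)
        printf '%s/%s %s\n' "$uid" "$gid" "$name"
        # Skip the header plus the data, which is padded to 512-byte blocks.
        block=$((block + 1 + (size + 511) / 512))
    done
}
```

Running list_header_ids example.tar prints one line per header. Reading byte by byte with dd is slow on large archives, so this is meant for spot checks rather than routine use.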