How to Display Numeric UIDs/GIDs from Tar Archives When Symbolic Names Don’t Match Local Users


When working with tar archives from different systems, you'll often encounter mismatches between symbolic user/group names in the archive and your local /etc/passwd. The tar format actually stores both textual and numeric UID/GID information, but most command-line tools default to showing only the symbolic names.

The simplest way to view numeric IDs:

tar tvf example.tar --numeric-owner

This forces tar to display numeric UIDs/GIDs regardless of whether matching users exist locally. Example output:

-rw-r--r-- 1000/1000   12345 2023-05-01 12:34 example.txt

For deeper inspection of the raw values stored in the tar file, dump the first 512-byte header block:

od -Ad -tx1 -N 512 example.tar | head

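This shows the first member's raw header. The UID occupies bytes 108-115 and the GID bytes 116-123. Each field holds ASCII octal digits padded with NULs or spaces, not a binary integer, so UID 1000 is stored as the text 0001750 (hex bytes 30 30 30 31 37 35 30 00 in this dump).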
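Reading those two fields by eye means converting octal text to decimal. The following is a minimal sketch, assuming bash and a first header whose UID/GID fields hold plain ASCII octal digits (the common case). The function name first_member_ids is made up for illustration, and very large IDs stored in GNU's base-256 form or in PAX records are not handled.

```shell
#!/bin/bash
# Print the decimal UID/GID stored in the first header block of a tar file.
# Assumption: both fields contain ASCII octal digits padded with NULs or spaces.
first_member_ids() {
    local archive=$1 uid_oct gid_oct
    # UID field: 8 bytes at offset 108. GID field: 8 bytes at offset 116.
    # tr strips the NUL/space padding and keeps only octal digits.
    uid_oct=$(dd if="$archive" bs=1 skip=108 count=8 2>/dev/null | tr -cd '0-7')
    gid_oct=$(dd if="$archive" bs=1 skip=116 count=8 2>/dev/null | tr -cd '0-7')
    # 8#N makes bash arithmetic read N as a base-8 number.
    echo "$((8#${uid_oct:-0}))/$((8#${gid_oct:-0}))"
}
```

Usage would look like first_member_ids example.tar, and the result should agree with the owner column that tar --numeric-owner -tvf prints for the first entry.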

Here's a bash script that counts how often each UID/GID pair occurs in an archive:

#!/bin/bash
# Field 2 of the verbose listing is the owner/group column.
tar --numeric-owner -tvf "$1" | awk '{print $2}' | \
  sort | uniq -c | sort -rn

Sample output showing UID/GID frequency:

   42 1000/1000
   15 0/0
    3 33/33

To check whether the numeric ownership matches your expectations:

tar --numeric-owner -tvf backup.tar | grep -v ' 0/0 ' | grep -v ' 1000/1000 '

This filters out common expected values (root and primary user) to spot anomalies.
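Chained grep -v calls match anywhere on the line and get unwieldy as the list of expected owners grows. A hedged alternative, assuming bash and awk, compares only the owner/group column against an allowlist passed as arguments. The function name unexpected_owners is hypothetical.

```shell
#!/bin/bash
# Print listing lines whose owner/group pair is not in the allowlist.
# Input: `tar --numeric-owner -tvf` output on stdin.
# Arguments: the expected uid/gid pairs, e.g. 0/0 1000/1000.
unexpected_owners() {
    awk -v allowed="$*" '
        BEGIN {
            # Turn the space-separated allowlist into a lookup table.
            n = split(allowed, a, " ")
            for (i = 1; i <= n; i++) ok[a[i]] = 1
        }
        # Field 2 is the uid/gid column; print lines not on the allowlist.
        !($2 in ok)
    '
}
```

For example: tar --numeric-owner -tvf backup.tar | unexpected_owners 0/0 1000/1000. Because only field 2 is compared, a file name that happens to contain " 0/0 " does not hide an entry.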

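Archives in PAX format can carry uid and gid records in extended headers, which take precedence over the fixed-width header fields (for example, when an ID is too large for the octal field). GNU tar reads these records, and sparse-file metadata, automatically when listing, so no extra option is needed:

tar --numeric-owner -tvf example.tar

The --pax-option=exthdr.name=... keyword only controls how extended header blocks are named when an archive is created. It does not change how ownership is listed.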


When working with tar archives containing system backups, you might encounter situations where file ownership information appears incorrect or doesn't match your system's /etc/passwd entries. The standard tar tvf command shows symbolic owner names, but these might not reflect the actual numeric UIDs/GIDs stored in the archive.

To see the actual numeric values stored in the tar file, use the --numeric-owner option:

tar tvf example.tar --numeric-owner

This will display output like:

-rw-r--r-- 1000/1000   12345 2023-01-01 12:34 path/to/file.txt

Where 1000/1000 represents the UID/GID pair.

You can also raise the verbosity with a second v:

tar tvvf example.tar --numeric-owner

The extra verbosity does not change the owner column. The numeric IDs come from --numeric-owner, not from the additional v.

Listing never extracts anything, so to look at only the first few entries of a large archive:

tar tvf example.tar --numeric-owner | head -n 20

Or for a specific file:

tar tvf example.tar --numeric-owner path/to/specific/file

For analyzing multiple files, this bash script counts all UID/GID pairs:

#!/bin/bash
# Field 2 of the verbose listing is owner/group; field 3 is the file size.
tar tvf "$1" --numeric-owner | awk '{print $2}' | sort | uniq -c | sort -n

Save as analyze_tar_ids.sh and run:

./analyze_tar_ids.sh example.tar

To cross-reference with your system's user database:

# Field 2 is owner/group; keep the UID half and look each one up.
tar tvf example.tar --numeric-owner | awk '{print $2}' | cut -d'/' -f1 | sort -u | while read -r uid; do
    grep "^[^:]*:[^:]*:$uid:" /etc/passwd || echo "UID $uid not found in /etc/passwd"
done
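The group half of each pair can be checked the same way. This is a rough sketch, assuming bash, a numeric listing on stdin with the owner/group pair in field 2, and a group(5)-format file. The function name check_gids and its optional file argument are illustrative, and only local groups are covered (not LDAP or other directory services).

```shell
#!/bin/bash
# Report each distinct GID in a numeric tar listing (stdin) together with the
# matching group name from a group(5)-format file, or "not found".
check_gids() {
    local group_file=${1:-/etc/group}
    # Field 2 is uid/gid; keep the part after the slash, deduplicated.
    awk '{print $2}' | cut -d/ -f2 | sort -un | while read -r gid; do
        # The third colon-separated field of a group entry is the numeric GID.
        name=$(awk -F: -v g="$gid" '$3 == g {print $1; exit}' "$group_file")
        echo "GID $gid: ${name:-not found}"
    done
}
```

A typical call would be tar --numeric-owner -tvf example.tar | check_gids, which reads /etc/group by default.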

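GNU tar accepts --numeric-owner either before or after the operation letters, so this form is equivalent:

tar --numeric-owner -tvf example.tar

Some BSD tar implementations accept a shortened spelling, though support varies by version, so check man tar on your system:

tar -tvf example.tar --numeric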

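For a quick look at the raw header fields of the first member (the values are ASCII octal text, not binary integers, so dump them as characters):

od -A d -j 108 -N 16 -c example.tar

This prints the 8-byte UID field followed by the 8-byte GID field. The digits are octal, so 0001750 means UID 1000.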