How to Systematically Compare Two Red Hat Linux Servers Using File Checksums and rsync



When maintaining identical Red Hat Enterprise Linux (RHEL) server configurations, we often need to audit file-level differences beyond simple directory listings. Your approach using find with SHA1 checksums is fundamentally sound, but let's explore more robust implementations and alternative methods.

The enhanced version of your checksum approach:

#!/bin/bash
# Generate checksums excluding special directories
find / \( -path /proc -o -path /sys -o -path /dev -o -path /run -o -path /tmp \) \
  -prune -o -type f -exec sha256sum {} + > /root/server_checksums.sha256

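# Compare between servers (after transferring one file)
scp root@server2:/root/server_checksums.sha256 /root/server2.sha256
# Key on the path (column 67 onward) and print server2 entries that are
# missing on server1 or whose hash differs
awk '{h=$1; p=substr($0,67)} NR==FNR{a[p]=h;next} !(p in a) || a[p]!=h' \
  /root/server_checksums.sha256 /root/server2.sha256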

rsync can indeed provide comprehensive difference reports without actual file transfer:

# Run on server1: rsync cannot take two remote endpoints in one command
rsync -nrcv --delete --itemize-changes / root@server2:/ > diff_report.txt

Key flags explanation:

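  • -n: Dry run (no actual transfer)
  • -r: Recurse into directories
  • -c: Compare file contents by checksum instead of size and modification time (slower but more accurate)
  • -v: Verbose output
  • --delete: Report files that exist only on the destination (a real run would delete them)
  • --itemize-changes: Print a change code for each item that differs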
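
A full-system report can run to thousands of lines, so a quick tally of the change codes helps before reading it in detail. The following is a minimal sketch, assuming the report was saved to a file as above; `summarize_itemized` is a hypothetical helper name, and it only counts regular files (second code character `f`).

```shell
#!/bin/bash
# summarize_itemized REPORT   (hypothetical helper, not part of rsync)
# Tallies regular-file entries in an rsync --itemize-changes report.
# Change lines start with an 11-character code such as "<fcst......",
# and deletions start with the literal "*deleting".
summarize_itemized() {
    awk '
        /^\*deleting/ { deleted++; next }
        /^[<>ch.][fdLDS]/ {
            code = substr($0, 1, 11)
            if (substr(code, 2, 1) != "f") next           # skip dirs, symlinks, devices
            if (substr(code, 3, 9) == "+++++++++") created++
            else if (substr(code, 1, 1) ~ /[<>]/) changed++   # content would be transferred
            else attrs++                                   # only attributes differ
        }
        END {
            printf "new=%d changed=%d attrs_only=%d deleted=%d\n",
                   created, changed, attrs, deleted
        }
    ' "$1"
}
```

Calling `summarize_itemized diff_report.txt` prints a single line of counts; other lines that rsync writes in verbose mode, such as the file-list header, are ignored.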

For package-level verification in RHEL systems:

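# Full name-version-release.arch list for the local server
rpm -qa --qf "%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n" | sort > rpm_packages.txt
# Compare both servers on the same fields so version drift shows up too
diff <(ssh root@server1 "rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort") \
     <(ssh root@server2 "rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort")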
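
If the package lists have already been saved to files, the same comparison can be done offline. This is a rough sketch using `comm`; the function name `compare_pkg_lists` and the output labels are assumptions made for illustration.

```shell
#!/bin/bash
# compare_pkg_lists LIST1 LIST2   (hypothetical helper)
# Prints packages that appear in only one of two saved package lists.
compare_pkg_lists() {
    local list1="$1" list2="$2" tmp1 tmp2
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    # comm needs both inputs sorted in the same collation order
    LC_ALL=C sort -u "$list1" > "$tmp1"
    LC_ALL=C sort -u "$list2" > "$tmp2"
    # comm -23: lines unique to the first file; comm -13: unique to the second
    LC_ALL=C comm -23 "$tmp1" "$tmp2" | sed 's/^/only-on-server1 /'
    LC_ALL=C comm -13 "$tmp1" "$tmp2" | sed 's/^/only-on-server2 /'
    rm -f "$tmp1" "$tmp2"
}
```

A package installed at different versions appears twice, once under each label, because the full name-version-release string is compared.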

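Targeted comparison for critical config files. diff needs both trees locally, so copy server2's /etc to a scratch directory first (the path /root/server2_etc is only an example):

rsync -a root@server2:/etc/ /root/server2_etc/
diff -qr /etc/ /root/server2_etc/ | grep -v -e "\.svn" -e "\.git"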

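For large filesystems, a quicker first pass is to stay on the root filesystem (-xdev) and hash only files larger than 1 MiB with MD5. MD5 is adequate for spotting accidental drift but not for security checks, and smaller files are skipped entirely: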

find / -xdev \( -path /proc -o -path /sys -o -path /dev \) -prune -o \
  -type f -size +1M -exec md5sum {} + > large_files.md5

When managing multiple Red Hat Linux servers that should maintain identical configurations, identifying subtle differences becomes crucial for troubleshooting and security. While simple file listings help surface obvious discrepancies, they fail to detect content-level variations.

The basic approach using find gives us a foundation:

# Generate file inventory excluding special directories
find / \( -path /proc -o -path /sys -o -path /dev \) -prune -o -print | sort > server1_files.txt

However, this only compares filenames and paths, not contents. For production environments, we need deeper analysis.

A more robust solution involves generating SHA checksums for all files:

# Generate checksums for all files (excluding special dirs)
find / \( -path /proc -o -path /sys -o -path /dev -o -path /run -o -path /tmp \) -prune \
-o -type f -exec sha256sum {} \; > server1_checksums.txt

Key improvements over the basic approach:

  • Uses SHA-256 for stronger hashing
  • Excludes additional volatile directories
  • Only processes regular files (-type f)
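
The exclusion list tends to grow over time, so it can help to pass it as arguments rather than edit the find expression by hand. Below is a minimal sketch under that assumption; `make_manifest` is a hypothetical helper, and it relies on GNU find's `-false` test to start the `-o` chain.

```shell
#!/bin/bash
# make_manifest ROOT [EXCLUDED_PATH...]   (hypothetical helper)
# Prints "hash  path" lines for regular files under ROOT, sorted by path.
# EXCLUDED_PATH values must be written the way find prints them (e.g. /proc).
make_manifest() {
    local root="$1" n
    shift
    # Turn each remaining argument into "-o -path ARG" without needing arrays.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" -o -path "$1"
        shift
        n=$((n - 1))
    done
    # -false gives the -o chain a harmless first operand (GNU find).
    find "$root" \( -false "$@" \) -prune -o -type f -exec sha256sum {} + |
        LC_ALL=C sort -k2
}
```

For example, `make_manifest / /proc /sys /dev /run /tmp > server1_checksums.txt` would produce a manifest already sorted by path, which keeps later diffs stable between runs.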

Transfer both checksum files to one server and compare:

# Sort by path (field 2) so the same file lines up on both sides, then compare
sort -k2 server1_checksums.txt > server1_sorted.txt
sort -k2 server2_checksums.txt > server2_sorted.txt
diff -u server1_sorted.txt server2_sorted.txt > differences.diff
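
A line-based diff reports a modified file as one removed line plus one added line, which can be hard to read in a long report. As an alternative view, the sketch below classifies each path directly. `diff_manifests` and its three labels are hypothetical names, and it assumes plain sha256sum output (64 hex digits, two separator characters, then the path) in a non-empty first manifest.

```shell
#!/bin/bash
# diff_manifests MANIFEST1 MANIFEST2   (hypothetical helper)
# Classifies every path as CHANGED, ONLY_IN_1 or ONLY_IN_2.
diff_manifests() {
    awk '
        { hash = $1; path = substr($0, 67) }   # path starts after hash + 2 chars
        NR == FNR { first[path] = hash; next } # load the first manifest
        { seen[path] = 1 }
        !(path in first)    { print "ONLY_IN_2 " path; next }
        first[path] != hash { print "CHANGED   " path }
        END {
            for (p in first)
                if (!(p in seen)) print "ONLY_IN_1 " p
        }
    ' "$1" "$2" | LC_ALL=C sort
}
```

Running `diff_manifests server1_checksums.txt server2_checksums.txt` does not require the inputs to be sorted, and paths containing spaces are kept intact because the path is taken by column position rather than by field.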

While rsync is typically used for synchronization, its dry-run mode offers excellent comparison capabilities:

# Perform dry-run checksum comparison
rsync -nrc --out-format="%f %M %l" --delete / root@server2:/ > rsync_diff.txt

Breakdown of flags:

  • -n: dry run (no actual transfer)
  • -r: recursive
  • -c: compare using checksums
  • --delete: show files that would be deleted

For servers with thousands of files, consider these optimizations:

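# Parallel checksum generation with GNU parallel
# -print0/-0 keep unusual filenames intact; -X passes many files to each sha256sum call
find / -xdev -type f -print0 | parallel -0 -j8 -X sha256sum > checksums.txt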

# Exclude known variable directories
find / \( -path "/proc/*" -o -path "/sys/*" -o -path "/dev/*" \
-o -path "/run/*" -o -path "/var/log/*" -o -path "/var/cache/*" \) \
-prune -o -type f -exec sha256sum {} \;

Create a cron job for scheduled comparisons:

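#!/bin/bash
# Generate timestamp
TS=$(date +%Y%m%d_%H%M%S)
OUT=/var/log/checksums
# Create checksums, skipping virtual filesystems and the report directory itself
# (otherwise every run would hash the previous reports and always show drift)
find / \( -path /proc -o -path /sys -o -path /dev -o -path /run -o -path /tmp -o -path "$OUT" \) \
  -prune -o -type f -exec sha256sum {} + | sort -k2 > "$OUT/checksum_${TS}.txt"
# Compare with baseline (the baseline must be generated and sorted the same way)
diff "$OUT/baseline.txt" "$OUT/checksum_${TS}.txt" > "$OUT/diff_${TS}.txt"
# Alert if differences found
if [ -s "$OUT/diff_${TS}.txt" ]; then
    mail -s "Server configuration drift detected" admin@example.com < "$OUT/diff_${TS}.txt"
fi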

For production environments:

  1. Establish a known-good baseline checksum file
  2. Regularly compare against this baseline
  3. Combine both checksum and rsync approaches
  4. Document expected differences
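
One way to keep the documented exceptions usable is to store them in a plain allowlist file and filter reports against it. The following is only a sketch: `filter_expected` and the allowlist layout (one path per line, `#` for comments explaining why a difference is expected) are assumptions, and matching is by fixed substring, so `/etc/hosts` would also hide `/etc/hosts.allow`.

```shell
#!/bin/bash
# filter_expected REPORT ALLOWLIST   (hypothetical helper)
# Prints the lines of REPORT that mention none of the paths in ALLOWLIST.
# Comment lines (starting with #) and blank lines in ALLOWLIST are ignored.
filter_expected() {
    grep -v -e '^#' -e '^$' "$2" | grep -v -F -f - "$1"
}
```

For instance, `filter_expected differences.diff expected_paths.txt` would leave only the unexplained differences, where `expected_paths.txt` is an example filename for the allowlist.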