How to Find and Analyze Largest Directories in Linux/Unix Systems for Disk Space Management


When managing Linux servers, discovering large directories consuming disk space is a common administrative task. The du (disk usage) command combined with sorting and filtering provides the most effective solution.

du -h --max-depth=1 / | sort -rh | head -n 10

This command pipeline:

  • du -h --max-depth=1 / shows human-readable sizes for top-level directories
  • sort -rh sorts in descending order, correctly comparing human-readable suffixes (K, M, G)
  • head -n 10 displays only the top 10 largest entries

For more detailed analysis, consider these variations:

# Analyze a specific directory, including individual files (-a)
du -ah /var/log | sort -rh | head -n 20

# Exclude mount points
du -h -x --max-depth=1 / | sort -rh

# Show only directories over 1GB
du -h --threshold=1G /home

For interactive exploration, install and use ncdu:

sudo apt install ncdu  # Debian/Ubuntu
sudo yum install ncdu  # RHEL/CentOS (dnf on newer releases)
ncdu /

This provides a navigable interface showing directory sizes with percentage breakdowns.

Create a cron job for regular monitoring:

# Daily report at 03:00 in /var/log/disk_usage.log (system crontab format, with user field)
0 3 * * * root du -h --max-depth=3 / | sort -rh > /var/log/disk_usage.log
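Periodic reports are useful, but an alert fires only when something is actually wrong. A minimal sketch of a threshold check on the root filesystem (the 90% threshold and the warning message are illustrative choices, not fixed conventions):

```shell
#!/bin/sh
# Sketch: warn when the root filesystem passes a usage threshold.
# THRESHOLD is an illustrative value; tune it for your environment.
THRESHOLD=90
# df -P gives stable single-line POSIX output; field 5 is "Use%"
usage=$(df -P / | awk 'NR==2 {gsub("%", ""); print $5}')
if [ "$usage" -gt "$THRESHOLD" ]; then
    echo "WARNING: / is at ${usage}% capacity"
fi
```

Scheduled from the same crontab, the output can be redirected to a log or piped to mail.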

Common culprits include:

  • Log files in /var/log
  • Docker containers in /var/lib/docker
  • Temporary files in /tmp
  • User home directories with large media files
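The list above can be checked in a single pass with a short loop (the directory list mirrors the bullets; extend it for your setup):

```shell
# Summarize the usual suspects; -s prints one total per argument
for dir in /var/log /var/lib/docker /tmp /home; do
    [ -d "$dir" ] && du -sh "$dir" 2>/dev/null
done
```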

To find and delete old log files:

find /var/log -type f -name "*.log" -mtime +30 -exec rm -f {} \;
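Because the -exec rm form is destructive, it is worth previewing the matches first; swapping the action for -print turns the same command into a dry run:

```shell
# Dry run: list the files the delete command above would remove
find /var/log -type f -name "*.log" -mtime +30 -print
```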

When your Linux server screams "low disk space", the du (disk usage) command becomes your first responder. Here's the most effective one-liner I use daily:

du -h --max-depth=1 / | sort -rh | head -n 20

This command breakdown:

  • -h: Human-readable sizes (K, M, G)
  • --max-depth=1: Only show immediate subdirectories
  • sort -rh: Sort reverse numerically with human-readable values
  • head -n 20: Display top 20 results

For more granular analysis, try these variations:

# Analyze specific directory
du -h --max-depth=2 /var | sort -rh | head -n 15

# Exclude mounted filesystems
du -h --max-depth=1 -x / | sort -rh

# Show both files and directories
du -ah /home | sort -rh | head -n 25

When CLI output isn't enough, ncdu (NCurses Disk Usage) provides interactive visualization:

# Install ncdu (Ubuntu/Debian)
sudo apt install ncdu

# Basic scan
ncdu /

# Export results for later analysis
ncdu -o scan_results /path/to/scan

# Reload a previously exported scan
ncdu -f scan_results

Sometimes directories contain massive individual files. Find them with:

find / -type f -size +500M -exec ls -lh {} + 2>/dev/null

Pro tip: After verifying the output, replace the -exec action with -delete to remove the matches (use with extreme caution; there is no undo).
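A destructive variant might look like the following, assuming you have already reviewed the matches; scoping it to a specific directory such as /var/tmp (an example path, not a recommendation) is safer than running it against /:

```shell
# CAUTION: -delete removes matching files immediately; there is no undo.
# -delete replaces the -exec action and must come after all the tests.
find /var/tmp -type f -size +500M -delete 2>/dev/null
```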

Create a cron job for regular monitoring:

# Add to crontab -e
0 3 * * * /usr/bin/du -h --max-depth=1 / | /usr/bin/sort -rh | /usr/bin/head -n 20 > /var/log/disk_usage.log

Special cases for common space hogs:

# Check log files
sudo du -h /var/log | sort -rh

# Inspect Docker storage
docker system df
docker system prune --volumes

For faster scanning on large filesystems:

  • Prefix the command with time to measure scan duration (du's own --time flag reports file modification times, not timing)
  • Add --exclude for non-essential directories
  • Run during low-activity periods
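Putting those tips together, a combined scan might look like this (the excluded names are example patterns, not requirements):

```shell
# Measure the scan with time, stay on one filesystem (-x), and skip
# directory names you don't need; 'proc' and 'sys' are example patterns.
time du -h -x --max-depth=1 --exclude=proc --exclude=sys / 2>/dev/null | sort -rh | head -n 10
```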