When managing headless Linux servers, understanding disk consumption patterns is crucial for maintenance and capacity planning. Treemap visualizations provide immediate spatial awareness of storage allocation, making them far more effective than plain du -h output for spotting space hogs.
For servers without GUI environments, these command-line tools create treemap-compatible data:
# ncdu - export scan results for later visualization
ncdu -o scan_results.json /path/to/scan
# dust - alternative with color-coded output
dust -d 3 / | less -R
# gdu - Go-based disk analyzer with JSON export
gdu --output-file disk_usage.json /
The most effective approach combines CLI scanning on the server with visualization on your local machine:
- On the server, scan and export the results:
# Scan the root filesystem, staying on one filesystem and skipping /mnt
ncdu -x --exclude /mnt -o ncdu_results.json /
- Transfer the results to your local machine:
scp user@server:/path/ncdu_results.json ~/disk_analysis/
- Visualize locally:
# Re-open the exported scan in ncdu's interactive browser
ncdu -f ncdu_results.json
Note that ncdu itself has no HTML export. For a KDirStat-style treemap of a remote filesystem, tools such as QDirStat read their own cache format, generated on the server with the qdirstat-cache-writer script, rather than ncdu's JSON.
When graphical tools aren't available, consider these structured CLI alternatives:
# Largest directories first, down to three levels
du -h --max-depth=3 / | sort -hr | less
For interactive navigation over SSH, ncurses-based tools include:
- ncdu (already mentioned)
- vdu (Vim-like interface)
- gdu (TUI interface)
For programmatic analysis, this Python script generates treemap-ready JSON:
import os
import json

def get_dir_size(start_path):
    # Total size in bytes of every file under start_path
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(start_path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            try:
                total_size += os.path.getsize(fp)
            except OSError:  # broken symlink, file removed mid-scan, etc.
                pass
    return total_size

def build_tree(path):
    # Recursively build a {name, size, children} tree suitable for treemaps
    tree = {'name': os.path.basename(path), 'children': []}
    try:
        for entry in os.listdir(path):
            full_path = os.path.join(path, entry)
            # Recurse into real directories only; skip symlinks to avoid loops
            if os.path.isdir(full_path) and not os.path.islink(full_path):
                tree['children'].append(build_tree(full_path))
            else:
                try:
                    tree['children'].append({
                        'name': entry,
                        'size': os.path.getsize(full_path)
                    })
                except OSError:
                    pass
        tree['size'] = get_dir_size(path)
    except PermissionError:
        pass
    return tree

with open('disk_tree.json', 'w') as f:
    json.dump(build_tree('/'), f)
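To actually render that JSON as a treemap on a workstation, one option is the third-party plotly package. The sketch below is a minimal example under that assumption (pip install plotly); the flatten() helper and file names are illustrative, not part of any standard API:
import json
import plotly.graph_objects as go  # assumes: pip install plotly

labels, parents, values = [], [], []

def flatten(node, parent_label=None):
    # Use the full path as the label so every node is unique for plotly
    label = node['name'] or '/'
    if parent_label:
        label = parent_label.rstrip('/') + '/' + node['name']
    labels.append(label)
    parents.append(parent_label or '')
    # Only leaves carry a value; plotly sums directory sizes from their children
    values.append(0 if node.get('children') else node.get('size', 0))
    for child in node.get('children', []):
        flatten(child, label)

with open('disk_tree.json') as f:
    flatten(json.load(f))

fig = go.Figure(go.Treemap(labels=labels, parents=parents, values=values))
fig.write_html('disk_treemap.html')  # open the report in any browser
The generated disk_treemap.html is self-contained, so it can also be dropped into a directory served over HTTP as described below.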
For teams needing shared access to disk analysis:
# Serve results via HTTP (Python 3.7+ one-liner; --directory needs 3.7)
python3 -m http.server 8000 --directory /path/to/exported_report
Or use dedicated tools:
- Diskover-web (web UI backed by an Elasticsearch index of your scans)
- Custom D3.js treemap visualization fed by the JSON generated above
Working with headless Linux servers often means sacrificing visual tools like KDirStat or WinDirStat. Here's how to analyze disk usage effectively through SSH:
ncdu (NCurses Disk Usage) provides interactive exploration:
# Install (Debian/Ubuntu)
sudo apt install ncdu
# Scan filesystem
ncdu /path/to/scan
# Export results
ncdu -o scan_results.json /
dust offers intuitive directory summaries:
cargo install du-dust
dust /var --depth 2
For true treemap visualization, consider this pipeline:
# 1. Generate data (Python example, run on the server)
import csv
import subprocess

# du -m prints sizes as plain integers in MiB, so no unit parsing is needed
result = subprocess.run(['du', '-m', '--max-depth=5', '/'],
                        capture_output=True,
                        text=True)

# 2. Process for visualization (save to CSV)
with open('disk_usage.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['path', 'size'])
    for line in result.stdout.splitlines():
        size, path = line.split('\t', 1)
        writer.writerow([path, size])
Transfer the CSV to a workstation and visualize with:
- D3.js treemap (browser-based)
- RAWGraphs (open source visualization tool)
- Python with matplotlib and squarify (pip install squarify):
import pandas as pd
import matplotlib.pyplot as plt
import squarify

df = pd.read_csv('disk_usage.csv')
df = df[df['size'] > 0]  # squarify requires positive sizes
# Each rectangle's area is proportional to the directory size in MiB
squarify.plot(sizes=df['size'], label=df['path'], alpha=.8)
plt.axis('off')
plt.show()
Web-based solutions:
# Serve directory via HTTP (Python 3.7+ for --directory)
python3 -m http.server 8000 --directory /path/to/share
Then open the exported reports (HTML, JSON, or CSV) from any device with a browser.
JSON API endpoints:
# FastAPI example (assumes fastapi and uvicorn are installed)
import glob
import subprocess

from fastapi import FastAPI

app = FastAPI()

@app.get("/disk-usage")
def get_disk_usage():
    # subprocess.run does not expand shell globs, so expand /* explicitly
    top_level = glob.glob('/*')
    result = subprocess.run(['du', '-s', '--block-size=1M', *top_level],
                            capture_output=True,
                            text=True)
    return {"data": result.stdout}