When troubleshooting disk-bound systems, Linux offers several powerful command-line utilities:
# Install essential monitoring tools
sudo apt-get install sysstat iotop dstat -y # Debian/Ubuntu
sudo yum install sysstat iotop dstat -y # RHEL/CentOS
The iotop utility provides top-like functionality for disk I/O:
sudo iotop -oPa # Show only active processes with accumulated I/O
Key columns to watch:
- DISK READ/DISK WRITE: Current throughput
- SWAPIN: Swap activity indicator
- IO>: Percentage of time spent in I/O
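To capture this over time instead of watching interactively, iotop's batch mode can append timestamped samples to a file (the -b, -t, -n and -qqq flags are standard iotop options; the log path is arbitrary):
sudo iotop -obt -qqq -n 30 >> /var/log/iotop-samples.log # 30 one-second samples, active processes only, no headers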
For a broader system view including disk utilization percentage:
dstat -td --disk-util --disk-tps 1 10 # 10 samples at 1s intervals
Sample output interpretation:
----system---- --dsk/sda-- --dsk/sdb--
time | util tps | util tps
12:00:01 | 45% 120 | 12% 15
The sar tool from sysstat provides historical data analysis:
sar -d -p 1 3 # Report device activity 3 times at 1s intervals
Key metrics:
- %util: Percentage of elapsed time the device was busy servicing I/O requests
- await: Average wait time for I/O (ms)
- svctm: Average service time (ms); deprecated and no longer reported by recent sysstat releases
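Because the sysstat cron job stores daily binary files, sar can also replay past intervals. The path below is the Debian/Ubuntu default (/var/log/sa/saDD on RHEL/CentOS), and the day-of-month in the filename is just an example:
sar -d -p -f /var/log/sysstat/sa15 -s 09:00:00 -e 12:00:00 # disk activity recorded on the 15th between 09:00 and 12:00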
To approximate the share of a device's I/O capacity in use, you can compare observed throughput against a rough ceiling:
#!/bin/bash
DEVICE=sda
# nr_requests is only the block-layer queue depth; treat it as a coarse ceiling,
# or substitute your device's rated IOPS for a more meaningful figure.
MAX_IOPS=$(cat /sys/block/${DEVICE}/queue/nr_requests)
# Use the second iostat report: the first one averages activity since boot.
CURRENT_IO=$(iostat -d ${DEVICE} 1 2 | awk -v dev="${DEVICE}" '$1 == dev {tps=$2} END {print int(tps)}')
UTIL_PCT=$(( (CURRENT_IO * 100) / MAX_IOPS ))
echo "Disk bandwidth utilization: ${UTIL_PCT}%"
When logging is the bottleneck, consider:
- Asynchronous logging: Use memory buffers (e.g., rsyslog with $ActionQueueType LinkedList; see the sketch after this list)
- Log rotation: Frequent rotation of smaller files
- RAM disks: For temporary logs (tmpfs)
- Rate limiting: e.g., rsyslog's $SystemLogRateLimitInterval/$SystemLogRateLimitBurst or journald's RateLimitIntervalSec/RateLimitBurst
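A minimal rsyslog sketch of that in-memory action queue, using the legacy directives named above; the forwarding target logserver.example.com and the queue sizes are placeholders:
# /etc/rsyslog.d/60-forward.conf (illustrative)
$ActionQueueType LinkedList # buffer this action's messages in memory
$ActionQueueSize 10000 # cap on queued messages
$ActionResumeRetryCount -1 # retry indefinitely instead of dropping on failure
*.* @@logserver.example.com:514 # forward everything to a remote collector over TCP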
For production systems, implement long-term monitoring:
# node_exporter flags for detailed disk metrics
--collector.diskstats.ignored-devices="^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$"
--collector.filesystem.ignored-mount-points="^/(sys|proc|dev|run|var/lib/docker)($|/)"
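These flags are passed to whatever starts node_exporter; a minimal systemd unit sketch, assuming the binary lives at /usr/local/bin/node_exporter and runs as a dedicated node_exporter user (recent node_exporter releases rename the flags to --collector.diskstats.device-exclude and --collector.filesystem.mount-points-exclude):
# /etc/systemd/system/node_exporter.service (illustrative)
[Unit]
Description=Prometheus node_exporter

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter \
  --collector.diskstats.ignored-devices="^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$" \
  --collector.filesystem.ignored-mount-points="^/(sys|proc|dev|run|var/lib/docker)($|/)"

[Install]
WantedBy=multi-user.target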
Key Grafana dashboard metrics to include:
- Disk I/O operations per second
- Average queue length
- Average wait time
- Percentage utilization
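To confirm those series are being scraped before building panels, you can query the Prometheus HTTP API directly (this assumes Prometheus on localhost:9090 and the default node_exporter metric names):
# Approximate utilization fraction of sda over the last 5 minutes
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(node_disk_io_time_seconds_total{device="sda"}[5m])'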
When optimizing server performance, disk I/O often becomes the bottleneck after CPU and memory optimizations. Unlike CPU usage, which has straightforward monitoring tools like top or htop, disk bandwidth measurement requires specialized utilities.
Here are the most powerful tools for monitoring disk activity:
1. iostat (from sysstat package)
The most comprehensive tool for disk I/O statistics:
iostat -dx 1
Key metrics to watch:
- %util: Percentage of elapsed time the device was busy servicing I/O requests
- await: Average time for I/O requests to be served (ms)
- r/s and w/s: Read and write operations per second
2. dstat
Provides real-time visualization of disk usage:
dstat -d --disk-util
This shows disk utilization percentage similar to what you requested.
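dstat can also log samples to CSV for later review (--output is a standard dstat flag; the path and 5-second interval are arbitrary):
dstat -td --disk-util --output /tmp/disk-util.csv 5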
3. iotop
Like top, but for disk I/O:
sudo iotop -o
Shows processes sorted by their disk I/O activity.
For more precise measurement, you can sample the kernel's per-device counters in /sys/block directly with a small bash script:
#!/bin/bash
# Field 1 of /sys/block/<dev>/stat is the number of completed read requests;
# sample it twice and divide by the interval to get read IOPS.
DEVICE="sda"
INTERVAL=1
while true; do
    R1=$(awk '{print $1}' /sys/block/$DEVICE/stat)
    sleep $INTERVAL
    R2=$(awk '{print $1}' /sys/block/$DEVICE/stat)
    READ_IOPS=$(( (R2 - R1) / INTERVAL ))
    echo "Disk read IOPS: $READ_IOPS"
done
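The same counters cover writes as well: per the kernel's block-layer stat documentation, field 5 is completed write requests and field 7 is sectors written (counted in 512-byte units), so a self-contained variant could be:
#!/bin/bash
DEVICE="sda"
INTERVAL=1
W1=$(awk '{print $5}' /sys/block/$DEVICE/stat) # writes completed
S1=$(awk '{print $7}' /sys/block/$DEVICE/stat) # sectors written (512-byte units)
sleep $INTERVAL
W2=$(awk '{print $5}' /sys/block/$DEVICE/stat)
S2=$(awk '{print $7}' /sys/block/$DEVICE/stat)
echo "Write IOPS: $(( (W2 - W1) / INTERVAL ))"
echo "Write throughput: $(( (S2 - S1) * 512 / INTERVAL )) bytes/s"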
Here's how to translate the numbers into actionable insights:
- 70-100% utilization: The disk is likely your bottleneck
- High await times: Requests are queueing at the device
- Consistent 100% utilization: Consider an SSD/NVMe upgrade or RAID to spread the load; a quick saturation check is shown below
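A one-liner for that saturation check (assumes device sda and relies on %util being the last column of iostat -dx, which holds for current sysstat releases):
iostat -dxy 1 | awk '$1 == "sda" && $NF+0 > 90 {print "sda %util:", $NF}' # prints a line only when the disk is >90% busy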
For your logging bottleneck concern, consider these solutions:
# Use ramdisk for temporary logs (memory-backed; contents are lost on reboot)
sudo mount -t tmpfs -o size=512m tmpfs /var/log/temp
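# To recreate the mount at boot, append a matching /etc/fstab entry (size and mode here are examples)
echo 'tmpfs /var/log/temp tmpfs size=512m,mode=0755 0 0' | sudo tee -a /etc/fstab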
# Or use logrotate more aggressively
/var/log/app.log {
daily
rotate 7
compress
delaycompress
missingok
notifempty
create 640 root adm
sharedscripts
postrotate
/etc/init.d/rsyslog reload >/dev/null 2>&1 || true
endscript
}
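Assuming the policy above is saved as /etc/logrotate.d/app (the path is an example), logrotate can dry-run it before the scheduled run picks it up:
sudo logrotate -d /etc/logrotate.d/app # debug mode: report what would be rotated without changing anything
sudo logrotate -f /etc/logrotate.d/app # force an immediate rotation once the dry run looks right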