How to Monitor Disk I/O Load and Bandwidth Utilization on Linux Servers



When troubleshooting disk-bound systems, Linux offers several powerful command-line utilities:


# Install essential monitoring tools
sudo apt-get install sysstat iotop dstat -y  # Debian/Ubuntu
sudo yum install sysstat iotop dstat -y      # RHEL/CentOS

The iotop utility provides top-like functionality for disk I/O:


sudo iotop -oPa  # Show only active processes with accumulated I/O

Key columns to watch:

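  • DISK READ/DISK WRITE: Current throughput per process (accumulated totals when -a is set)
  • SWAPIN: Percentage of time the process spent waiting on swap-in
  • IO>: Percentage of time the process spent waiting on I/O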
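
When iotop is not installed, the kernel's per-process counters in /proc/<pid>/io give a rough cross-check of the same activity. The sketch below assumes a kernel built with task I/O accounting; io_bytes is a hypothetical helper name, not part of any tool.

```shell
#!/bin/bash
# Print "read_bytes write_bytes" from /proc/<pid>/io-formatted text on stdin.
# These counters reflect I/O that actually reached the storage layer.
io_bytes() {
    awk -F': *' '
        BEGIN { r = 0; w = 0 }
        $1 == "read_bytes"  { r = $2 }
        $1 == "write_bytes" { w = $2 }   # exact match skips cancelled_write_bytes
        END { print r, w }
    '
}

# Usage (reading another user's process requires root):
#   io_bytes < /proc/$$/io
```

The counters are cumulative since the process started, so sampling twice and subtracting gives a rate.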

For a broader system view including disk utilization percentage:


dstat -td --disk-util --disk-tps 1 10  # 10 samples at 1s intervals

Sample output interpretation:


----system---- --dsk/sda-- --dsk/sdb--
  time     |  util  tps  |  util  tps  
12:00:01  |  45%   120  |  12%   15

The sar tool from sysstat samples device activity live and, when the sysstat data collector is enabled, can also report from its historical logs:


sar -d -p 1 3  # Report device activity 3 times at 1s intervals

Key metrics:

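  • %util: Percentage of elapsed time during which the device had I/O requests in flight
  • await: Average time (ms) an I/O request takes to complete, including time spent queued
  • svctm: Average service time (ms); deprecated as unreliable and dropped from recent sysstat releases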
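
To make the await figure less opaque, it can be reproduced from two snapshots of /sys/block/<dev>/stat: the change in milliseconds spent on reads and writes divided by the change in completed requests. This is a minimal sketch of that arithmetic; await_ms is a hypothetical helper name.

```shell
#!/bin/bash
# await = delta(ms spent on reads + writes) / delta(completed reads + writes)
# Stat fields used: 1 = reads completed, 4 = ms reading,
#                   5 = writes completed, 8 = ms writing.
await_ms() {
    # $1 and $2: contents of /sys/block/<dev>/stat, oldest snapshot first
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { ios = $1 + $5; ms = $4 + $8 }
        NR == 2 {
            d_ios = ($1 + $5) - ios
            d_ms  = ($4 + $8) - ms
            if (d_ios > 0) printf "%.2f\n", d_ms / d_ios
            else print "0.00"            # idle interval: nothing completed
        }'
}

# Usage:
#   s1=$(cat /sys/block/sda/stat); sleep 1; s2=$(cat /sys/block/sda/stat)
#   await_ms "$s1" "$s2"
```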

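The kernel does not know a device's maximum bandwidth, so %util (the share of time the device was busy) is the usual approximation:


#!/bin/bash
# Report %util for one device from the second iostat sample;
# the first sample is an average since boot and is discarded.
DEVICE=sda
# %util is the last column of `iostat -dx`; the last matching row wins.
UTIL_PCT=$(iostat -dx "$DEVICE" 1 2 | awk -v dev="$DEVICE" '$1 == dev { util = $NF } END { print util }')
echo "Disk utilization: ${UTIL_PCT}%"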
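
The utilization percentage can also be derived without sysstat. Field 10 of /sys/block/<dev>/stat (io_ticks) counts the milliseconds during which the device had I/O in flight, so its change over an interval, divided by the interval length, gives the busy percentage. A minimal sketch, with util_pct as a hypothetical helper name:

```shell
#!/bin/bash
# %util = delta(io_ticks) / elapsed_ms * 100
util_pct() {
    # $1, $2: contents of /sys/block/<dev>/stat, oldest snapshot first
    # $3: milliseconds elapsed between the two snapshots
    printf '%s\n%s\n' "$1" "$2" | awk -v elapsed="$3" '
        NR == 1 { t1 = $10 }
        NR == 2 { printf "%.1f\n", ($10 - t1) * 100 / elapsed }'
}

# Usage:
#   s1=$(cat /sys/block/sda/stat); sleep 1; s2=$(cat /sys/block/sda/stat)
#   util_pct "$s1" "$s2" 1000
```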

When logging is the bottleneck, consider:

  1. Asynchronous logging: Use memory buffers (e.g., rsyslog with $ActionQueueType LinkedList)
  2. Log rotation: Frequent rotation of smaller files
  3. RAM disks: For temporary logs (tmpfs)
  4. Rate limiting: Cap message bursts at the syslog daemon (e.g., rsyslog with $SystemLogRateLimitInterval and $SystemLogRateLimitBurst)
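
As an illustration of the asynchronous option, the sketch below prints an rsyslog drop-in that places an in-memory queue in front of one log file. The facility (local0), the queue size, and the default file path are assumptions chosen for the example, and rsyslog_async_conf is a hypothetical helper name.

```shell
#!/bin/bash
# Emit an rsyslog drop-in (legacy directive syntax) that buffers writes
# to a single log file in a LinkedList memory queue.
rsyslog_async_conf() {
    logfile="${1:-/var/log/app.log}"   # assumed path; pass your own
    cat <<EOF
# Queue directives apply to the next action only
\$ActionQueueType LinkedList
\$ActionQueueSize 10000
local0.* ${logfile}
EOF
}

# Usage:
#   rsyslog_async_conf /var/log/app.log | sudo tee /etc/rsyslog.d/30-app-async.conf
#   sudo rsyslogd -N1    # validate the configuration before reloading
```

An in-memory queue trades durability for throughput: messages still in the queue are lost if the daemon is killed.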

For production systems, implement long-term monitoring:


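# node_exporter flags for detailed disk metrics
# (newer releases rename these to --collector.diskstats.device-exclude
#  and --collector.filesystem.mount-points-exclude)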
--collector.diskstats.ignored-devices="^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$"
--collector.filesystem.ignored-mount-points="^/(sys|proc|dev|run|var/lib/docker)($|/)"

Key Grafana dashboard metrics to include:

  • Disk I/O operations per second
  • Average queue length
  • Average wait time
  • Percentage utilization
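
Before building dashboards, it can help to confirm that the exporter is actually publishing per-device counters. The sketch below extracts node_disk_io_time_seconds_total, the counter a utilization panel is typically derived from, out of the exporter's text output. disk_io_time is a hypothetical helper name, and the usage line assumes node_exporter is listening on its default port 9100 on localhost.

```shell
#!/bin/bash
# Print "<device> <seconds>" for each node_disk_io_time_seconds_total sample
# found in Prometheus text-format metrics on stdin.
disk_io_time() {
    # Input lines look like:
    #   node_disk_io_time_seconds_total{device="sda"} 1234.5
    sed -n 's/^node_disk_io_time_seconds_total{device="\([^"]*\)"} \(.*\)$/\1 \2/p'
}

# Usage:
#   curl -s http://localhost:9100/metrics | disk_io_time
```

The value is cumulative busy time in seconds, so its per-second rate of change corresponds to the utilization fraction.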

When optimizing server performance, disk I/O often becomes the bottleneck after CPU and memory optimizations. Unlike CPU usage, which has straightforward monitoring tools like top or htop, disk bandwidth measurement requires specialized utilities.

Here are the most powerful tools for monitoring disk activity:

1. iostat (from sysstat package)

The most comprehensive tool for disk I/O statistics:

iostat -dx 1

Key metrics to watch:

  • %util - Percentage of elapsed time during which the device was servicing I/O requests
  • await - Average time (ms) for I/O requests to be served, including queue time
  • r/s and w/s - Read and write operations per second
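
On a host with many disks, scanning the %util column by eye gets tedious. The sketch below filters iostat output down to devices at or above a threshold; flag_busy_devices is a hypothetical helper name and the default of 80 is an arbitrary example value.

```shell
#!/bin/bash
# Print "<device> <%util>" for devices whose %util (the last column of
# `iostat -dx`) is at or above the threshold given as $1 (default 80).
flag_busy_devices() {
    awk -v limit="${1:-80}" '
        /^Device/ { in_table = 1; next }   # header row starts a report
        NF == 0   { in_table = 0; next }   # blank line ends it
        in_table && $NF + 0 >= limit { printf "%s %.1f\n", $1, $NF }
    '
}

# Usage (-y omits the first report, which averages since boot):
#   iostat -dxy 1 3 | flag_busy_devices 80
```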

2. dstat

Provides real-time visualization of disk usage:

dstat -d --disk-util

This shows per-disk read and write throughput alongside the utilization percentage.

3. iotop

Like top but for disk I/O:

sudo iotop -o

Shows processes sorted by their disk I/O activity.

For more precise bandwidth measurement, you can create a simple bash script:

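#!/bin/bash
DEVICE="sda"
INTERVAL=1

while true; do
    # Fields 3 and 7 of /sys/block/<dev>/stat count sectors read and
    # written; a sector here is always 512 bytes.
    read -r _ _ R1 _ _ _ W1 _ < "/sys/block/$DEVICE/stat"
    sleep "$INTERVAL"
    read -r _ _ R2 _ _ _ W2 _ < "/sys/block/$DEVICE/stat"
    READ_KB=$(( (R2 - R1) * 512 / 1024 / INTERVAL ))
    WRITE_KB=$(( (W2 - W1) * 512 / 1024 / INTERVAL ))
    echo "Read: ${READ_KB} kB/s  Write: ${WRITE_KB} kB/s"
done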

Here's how to translate the numbers into actionable insights:

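  • 70-100% utilization: The disk is likely the bottleneck, though SSDs, NVMe drives, and RAID arrays serve requests in parallel and can report 100% before they saturate
  • High await times: Requests are queueing behind one another at the device
  • Consistent 100% utilization: Consider RAID or SSD upgrade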
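
A single high reading may be a burst, so it is worth separating spikes from sustained saturation. The sketch below succeeds only when every sample in a series is at or above a limit; sustained_above is a hypothetical helper name and the default of 70 mirrors the lower bound mentioned above.

```shell
#!/bin/bash
# Exit 0 only if stdin holds at least one %util sample and every sample
# is at or above the limit given as $1 (default 70). One value per line.
sustained_above() {
    awk -v limit="${1:-70}" '
        NF { n++; if ($1 + 0 < limit) below++ }
        END { exit !(n > 0 && below == 0) }
    '
}

# Usage: ten one-second samples for sda, first (since-boot) report omitted
#   iostat -dxy sda 1 10 | awk '$1 == "sda" { print $NF }' \
#       | sustained_above 70 && echo "sda is saturated"
```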

For your logging bottleneck concern, consider these solutions:

# Use ramdisk for temporary logs (contents are lost on reboot)
sudo mkdir -p /var/log/temp
sudo mount -t tmpfs -o size=512m tmpfs /var/log/temp

# Or use logrotate more aggressively
/var/log/app.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 640 root adm
    sharedscripts
    postrotate
        /etc/init.d/rsyslog reload >/dev/null 2>&1 || true
    endscript
}