Benchmarking Disk I/O: How to Measure Transfer Speeds (MB/s vs Mb/s) and Calculate 1500GB Copy Times



Disk speed is typically measured in megabytes per second (MB/s) for practical applications, though some manufacturers may advertise speeds in megabits per second (Mb/s). The conversion is simple: 1 MB/s = 8 Mb/s.

# Python conversion example: megabits per second -> megabytes per second
def mbit_to_mbyte_per_s(mbit_per_s):
    return mbit_per_s / 8

print(f"800 Mb/s = {mbit_to_mbyte_per_s(800)} MB/s")  # Output: 100.0

Here's a breakdown of modern storage performance:

Storage Type   | Average Speed    | Fast       | Cutting Edge
HDD (7200 RPM) | 80-160 MB/s      | 200 MB/s   | N/A
SATA SSD       | 350-550 MB/s     | 600 MB/s   | N/A
NVMe PCIe 3.0  | 2,000-3,000 MB/s | 3,500 MB/s | ~3,900 MB/s (x4 link limit)
NVMe PCIe 4.0  | 5,000 MB/s       | 7,000 MB/s | ~7,400 MB/s (x4 link limit)

(PCIe 5.0 drives push past 12,000 MB/s, but require a Gen5 slot.)

The formula for estimating transfer time (treating 1 GB as 1,024 MB):

time_seconds = (file_size_GB * 1024) / transfer_speed_MBps
time_hours = time_seconds / 3600

Example calculations:

# Python calculation for various drive types
def calculate_transfer_time(size_gb, speed_mbs):
    seconds = (size_gb * 1024) / speed_mbs
    return seconds / 3600  # return hours

print("1500GB transfer times:")
print(f"HDD (150MB/s): {calculate_transfer_time(1500, 150):.2f} hours")
print(f"NVMe (3000MB/s): {calculate_transfer_time(1500, 3000):.2f} hours")
print(f"PCIe 4.0 (7000MB/s): {calculate_transfer_time(1500, 7000):.2f} hours")

Several factors affect actual transfer speeds (a small benchmark sketch for the small-file case follows the list):

  • File system overhead (NTFS vs EXT4 vs ZFS)
  • File size distribution (many small files vs single large file)
  • Queue depth and parallel operations
  • Controller limitations (RAID card/HBA throughput)
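
To measure the small-file penalty from the second bullet on your own hardware, here is a minimal, self-contained sketch; the 256 MB volume and 64 KB chunk size are arbitrary choices, and absolute numbers will vary widely with file system and caching:

import os
import time
import tempfile

PAYLOAD = os.urandom(1024 * 1024)  # 1 MB of random data

def write_one_large(dirpath, total_mb=256):
    # One sequential file: close to the drive's streaming speed
    start = time.perf_counter()
    with open(os.path.join(dirpath, "large.bin"), "wb") as f:
        for _ in range(total_mb):
            f.write(PAYLOAD)
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start

def write_many_small(dirpath, total_mb=256, chunk_kb=64):
    # Same volume of data split across thousands of files:
    # per-file open/close/fsync and metadata updates dominate
    chunk = PAYLOAD[: chunk_kb * 1024]
    count = total_mb * 1024 // chunk_kb
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(dirpath, f"small_{i}.bin"), "wb") as f:
            f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as d:
    print(f"1 x 256 MB file:    {256 / write_one_large(d):.0f} MB/s")
    print(f"4096 x 64 KB files: {256 / write_many_small(d):.0f} MB/s")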

For database professionals working with 1500GB files:

# Linux rsync with performance tuning
rsync --progress --partial --bwlimit=0 --archive \
      --compress-level=0 /source/path /destination/path

# Windows robocopy with multithreading
robocopy C:\source D:\destination /MT:16 /R:1 /W:1 /ZB /NP /TEE /V /LOG:transfer.log
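
A few notes on these flags: robocopy's /MT:16 copies with 16 threads (the default is 8), /ZB uses restartable mode and falls back to backup mode if access is denied, and /R:1 /W:1 keeps retry stalls short. On the rsync side, --compress-level=0 disables compression, which is usually right for large database files: they rarely compress well, and compression turns a disk-bound copy into a CPU-bound one.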

Consider using these techniques for maximum throughput:

  • Disable antivirus scanning during transfer
  • Use direct I/O to bypass OS caches when appropriate (see the sketch after this list)
  • Align block sizes between source and destination
  • Consider network alternatives like iSCSI for remote transfers
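
For the direct I/O bullet, a minimal Linux-only sketch from Python (the path is a placeholder; os.O_DIRECT requires buffers, offsets, and lengths aligned to the device's logical block size, which an anonymous mmap satisfies because it is page-aligned):

import os
import mmap

BLOCK = 1024 * 1024  # 1 MB, a multiple of any common sector size

# Anonymous mmap buffers are page-aligned; a plain bytes object may not be.
buf = mmap.mmap(-1, BLOCK)
buf.write(b"\xab" * BLOCK)

# O_DIRECT (Linux-specific) bypasses the page cache entirely.
fd = os.open("/mnt/test/direct.bin", os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
try:
    written = os.write(fd, buf)  # writes the whole aligned buffer
    print(f"wrote {written} bytes, bypassing the page cache")
finally:
    os.close(fd)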

Disk performance is typically measured in two ways:

  • MB/s (Megabytes per second) - Standard for consumer/professional storage
  • IOPS (Input/Output Operations Per Second) - Important for database workloads

Note: 1 MB/s = 8 Mbit/s. Storage benchmarks generally use MB/s for practical measurements.
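
The two metrics are related through the I/O block size: throughput is roughly IOPS multiplied by the block size. A quick sketch (the block sizes here are illustrative):

def iops_to_mb_per_s(iops, block_size_kb):
    # MB/s ~= IOPS * block size, treating 1 MB as 1,024 KB
    return iops * block_size_kb / 1024

# 100,000 IOPS moves very different volumes at different block sizes:
print(iops_to_mb_per_s(100_000, 4))   # 390.625 MB/s at 4 KB blocks
print(iops_to_mb_per_s(100_000, 64))  # 6250.0 MB/s at 64 KB blocks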

Storage Type    | Average Speed  | Fast      | Cutting-edge
HDD (7200 RPM)  | 100-160 MB/s   | 200 MB/s  | -
SATA SSD        | 400-550 MB/s   | 600 MB/s  | -
NVMe SSD (Gen3) | 2000-2500 MB/s | 3000 MB/s | 3500 MB/s
NVMe SSD (Gen4) | 4000-5000 MB/s | 7000 MB/s | ~7400 MB/s

Basic formula:

transfer_time = file_size / transfer_speed

Python example:

def calculate_transfer_time(size_gb, speed_mbs):
    size_mb = size_gb * 1024  # convert GB to MB (treating 1 GB as 1,024 MB)
    return size_mb / speed_mbs  # seconds

# Example calculations
print(f"HDD (150MB/s): {calculate_transfer_time(1500, 150):.2f} seconds")
print(f"NVMe Gen4 (5000MB/s): {calculate_transfer_time(1500, 5000):.2f} seconds")

Actual transfer speeds are affected by:

  • File system overhead (NTFS vs EXT4 vs ZFS)
  • RAID configuration
  • Queue depth and I/O patterns
  • Controller bottlenecks

For database files specifically, consider using direct I/O in your application:

// C++ example using the Linux-specific O_DIRECT flag
#include <fcntl.h>  // open(), O_RDWR, O_DIRECT
int fd = open("database.bin", O_RDWR | O_DIRECT, 0666);
// Note: O_DIRECT requires block-aligned buffers and offsets

For professional systems handling 1500GB files:

  1. Use parallel transfer tools (e.g., several rsync streams with --compress-level=0 for binary files)
  2. Consider network-optimized protocols if transferring between systems
  3. Benchmark your actual hardware with:
# Linux example: quick cached/buffered read benchmark
hdparm -Tt /dev/sdX
# Or, for a sustained sequential-write test with more control:
fio --filename=/mnt/test/file --size=1500G --direct=1 --rw=write \
    --bs=1M --ioengine=libaio --iodepth=32 --runtime=60 --time_based \
    --group_reporting --name=throughput-test-job
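
To fold the fio run into a repeatable script, fio can emit JSON via --output-format=json. A minimal Python wrapper, assuming a recent fio that reports write bandwidth in KiB/s under jobs[0]["write"]["bw"] (the target path and job sizes below are placeholders):

import json
import subprocess

def fio_write_bandwidth(target, size="1G", runtime=30):
    # Run a sequential-write job and return bandwidth in MiB/s
    cmd = [
        "fio", "--name=throughput-test-job",
        f"--filename={target}", f"--size={size}",
        "--direct=1", "--rw=write", "--bs=1M",
        "--ioengine=libaio", "--iodepth=32",
        f"--runtime={runtime}", "--time_based",
        "--group_reporting", "--output-format=json",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    report = json.loads(result.stdout)
    bw_kib = report["jobs"][0]["write"]["bw"]  # KiB/s in fio's JSON output
    return bw_kib / 1024

print(f"{fio_write_bandwidth('/mnt/test/file'):.0f} MiB/s sequential write")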