The `bs` parameter in `dd` determines the number of bytes transferred in a single operation. Through extensive benchmarking across different storage devices (MMC and HDD), we observe significant performance variations based on block size selection.
# MMC Card (SanDisk Extreme Pro)
dd if=/dev/sdc of=/dev/null bs=4 count=250000000
→ 12MB/s (250M operations)
dd if=/dev/sdc of=/dev/null bs=1M count=1000
→ 14.1MB/s (1k operations)
Storage controllers have native transfer sizes (typically 4KB for modern SSDs). Using a `bs` that is a multiple of these values reduces:
- System call overhead (visible in sys time metrics)
- Interrupt coalescing requirements
- DMA setup operations
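To check what a given device actually advertises, the kernel exposes these sizes through sysfs; a quick look (the device name is a placeholder):
# Logical/physical sector size plus the reported minimum and optimal I/O sizes
lsblk -o NAME,LOG-SEC,PHY-SEC,MIN-IO,OPT-IO /dev/sdX
cat /sys/block/sdX/queue/optimal_io_size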
For general use:
# Safe defaults for most modern systems
dd if=/dev/sdX of=/dev/sdY bs=4M iflag=direct oflag=direct
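Note that `iflag=direct`/`oflag=direct` use O_DIRECT, which on Linux generally requires the block size (and offsets) to be aligned to the device's logical sector size, so odd byte-sized values such as `bs=5` typically fail with an invalid-argument error rather than merely running slowly.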
When maximum throughput matters:
# Performance test with varying block sizes
for bs in 512 1K 4K 64K 1M 4M; do
    echo "Testing bs=$bs"
    # count is fixed at 1K blocks, so total data written grows with bs (512KB up to 4GB)
    dd if=/dev/zero of=testfile bs=$bs count=1K oflag=direct
    sync; rm testfile
done
Combine block size with:
- `oflag=direct`: Bypasses the page cache for writes
- `conv=fdatasync`: Ensures physical write completion before dd reports its timing (see the comparison below)
- `iflag=nocache`: Asks the kernel to drop cached input data, so reads are not served from the page cache
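To see why the flush behaviour matters when benchmarking, compare what dd reports with and without it; a minimal sketch, assuming `testfile` is a scratch file on the filesystem you care about:
# Without a flush, dd reports the speed of writing into the page cache
dd if=/dev/zero of=testfile bs=1M count=1024
# With conv=fdatasync, the reported figure includes flushing to the physical device
dd if=/dev/zero of=testfile bs=1M count=1024 conv=fdatasync
rm testfile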
Example production-grade backup command:
dd if=/dev/sda bs=4M iflag=fullblock |
pv -s $(blockdev --getsize64 /dev/sda) |
dd of=/dev/sdb bs=4M oflag=direct conv=fdatasync
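The `iflag=fullblock` on the reading side tells GNU dd to keep reading until each 4M block is actually full rather than passing short reads along, which keeps the block accounting consistent through the pipe.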
While the default 512-byte block size works, optimized sizes (typically 1MB-4MB) can yield 2-3x throughput improvements. The exact sweet spot requires benchmarking on your specific hardware stack.
The `dd` command's block size (`bs`) parameter determines how much data is read or written in a single operation. Through extensive testing across different hardware configurations, we've observed significant performance variations:
# Sample benchmark command structure
time dd if=/dev/sdX of=/dev/null bs=Y count=Z
Our tests reveal several consistent patterns across storage devices:
MMC Card Performance
- bs=4: 12MB/s (surprisingly close to maximum throughput)
- bs≥5: 14.1-14.3MB/s (peak performance)
- bs=1: Drops to 3.1MB/s (worst case)
HDD Performance
- bs=10: 29.9MB/s
- bs=512: 95.3MB/s (default value)
- bs=1M: 97.6MB/s (optimal for this hardware)
Smaller block sizes significantly increase system CPU time:
# bs=1 (HDD)
real 5m41.463s
user 0m56.000s
sys 4m44.340s
# bs=1M (HDD)
real 0m10.792s
user 0m0.008s
sys 0m1.144s
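The gap is almost entirely system-call volume; one way to make it visible is strace's per-call summary (the absolute times are inflated by tracing, but the call counts are the point):
# Copy 1MiB as 2048 separate 512-byte operations, then as a single 1M operation
strace -c dd if=/dev/zero of=/dev/null bs=512 count=2048 2>&1 | tail -n 20
strace -c dd if=/dev/zero of=/dev/null bs=1M count=1 2>&1 | tail -n 20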
Based on these benchmarks, we recommend:
- For quick operations: Use `bs=1M` as a safe default
- For maximum throughput: Test `bs` values from 4K to 16M
- When copying between devices: Match the input and output block sizes (`ibs` and `obs`; see the sketch after this list)
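A sketch of the last point, with placeholder device names: `ibs` and `obs` let dd re-block the stream, reading at a size that suits the source and writing at a size that suits the destination.
dd if=/dev/sdX of=/dev/sdY ibs=64K obs=1M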
This script helps find the optimal block size for your hardware:
#!/bin/bash
# WARNING: this writes the test file straight to the target and will overwrite its contents
DEVICE=$1
TEST_FILE=/tmp/dd_test.img
# Create a 1GiB test file
dd if=/dev/zero of="$TEST_FILE" bs=1M count=1024
echo "Testing optimal block size for $DEVICE"
echo "Block Size,Speed"
for bs in 512 1K 4K 16K 64K 256K 1M 4M 16M
do
    sync
    echo 3 > /proc/sys/vm/drop_caches   # requires root
    # conv=fdatasync makes dd report the speed of getting data onto the device,
    # not just into the page cache; the grep also accepts kB/s and GB/s figures
    speed=$(dd if="$TEST_FILE" of="$DEVICE" bs=$bs conv=fdatasync 2>&1 | grep -o '[0-9.,]\+ [kMG]B/s')
    echo "$bs,$speed"
done
rm "$TEST_FILE"
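Run it as root (the drop_caches write requires it) against a device or file whose contents you can afford to lose, e.g. `sudo ./dd-bs-test.sh /dev/sdX`; the script name here is just whatever you saved it as.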
The performance impact stems from:
- System call overhead (read/write operations)
- Device I/O scheduler behavior
- Filesystem block size alignment
- DMA buffer sizes
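The scheduler and the filesystem block size are easy to inspect on a given system; a quick check, with placeholder device and mount point:
# Active I/O scheduler (the bracketed entry is the one in use)
cat /sys/block/sdX/queue/scheduler
# Fundamental block size of the filesystem holding the test file
stat -f -c 'block size: %S' /tmp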