When dealing with resource-intensive operations in shell scripting, we often need to:
- Run multiple commands in parallel to maximize resource utilization
- Control the concurrency level (N commands at a time)
- Wait for each batch to complete before proceeding to the next
- Handle job completion detection reliably
While lock files work, bash's built-in job control provides better alternatives (note that wait -n, used further down, requires bash 4.3+):
#!/bin/bash
# Set the maximum number of parallel jobs
MAX_JOBS=3
# Array to hold job PIDs
declare -a PID_ARRAY=()
# Command batches
BATCHES=(
"dd if=/dev/urandom of=/mnt/1/x bs=1024 count=1024000000"
"dd if=/dev/urandom of=/mnt/2/x bs=1024 count=1024000000"
"dd if=/dev/urandom of=/mnt/3/x bs=1024 count=1024000000"
# Additional commands...
)
for cmd in "${BATCHES[@]}"; do
# Run command in background
eval "$cmd" &
# Store PID
PID_ARRAY+=($!)
# When we reach MAX_JOBS, wait for completion
if [[ ${#PID_ARRAY[@]} -eq $MAX_JOBS ]]; then
wait "${PID_ARRAY[@]}"
PID_ARRAY=() # Reset array
fi
done
# Wait for any remaining jobs
wait "${PID_ARRAY[@]}"
For more sophisticated control, GNU parallel is ideal:
# Install if needed: sudo apt-get install parallel
# Example usage:
printf '%s\n' \
"dd if=/dev/urandom of=/mnt/1/x bs=1024 count=1024000000" \
"dd if=/dev/urandom of=/mnt/2/x bs=1024 count=1024000000" \
"dd if=/dev/urandom of=/mnt/3/x bs=1024 count=1024000000" | \
parallel -j 3 --halt soon,fail=1
# Options:
# -j N : Run N jobs in parallel
# --halt : Control behavior on failure
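If you also want a per-job record of runtimes and exit codes, --joblog provides one out of the box (a quick sketch; the log path is arbitrary):
printf '%s\n' "sleep 1" "sleep 2" "false" | \
    parallel -j 2 --joblog /tmp/jobs.log --halt soon,fail=1
# /tmp/jobs.log has one line per job: sequence number, start time,
# runtime, exit value, and the command itself
cat /tmp/jobs.log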
Here's a robust pattern for error handling:
#!/bin/bash
report_status() {
local task_id=$1
local status=$2
echo "$(date) - Task $task_id completed with status $status" >> /var/log/parallel_tasks.log
# Could add database update, API call, etc.
}
MAX_JOBS=3
# Background jobs run in subshells and cannot modify the parent shell's
# variables, so record per-task status in files rather than an array
status_dir=$(mktemp -d)
execute_task() {
local cmd="$1"
local task_id="$2"
if eval "$cmd"; then
TASK_STATUS[$task_id]="SUCCESS"
report_status "$task_id" 0
else
TASK_STATUS[$task_id]="FAILED"
report_status "$task_id" 1
return 1
fi
}
# Main execution loop
task_id=0
while IFS= read -r cmd; do
((task_id++))
execute_task "$cmd" "$task_id" &
# Control concurrency
if (( $(jobs -r | wc -l) >= MAX_JOBS )); then
wait -n   # bash 4.3+: returns as soon as any single job finishes
fi
done < tasks.list
# Wait for remaining jobs
wait
# Generate final report
echo "=== Task Completion Report ==="
for f in "$status_dir"/*; do
echo "Task ${f##*/}: $(cat "$f")"
done
rm -rf "$status_dir"
When implementing batch parallel processing:
- Monitor system load (consider adding uptime checks)
- For I/O-heavy tasks like your dd examples, consider staggering starts
- Use ionice for disk-intensive operations
- Implement timeout controls for hung processes
Example with timeout controls:
# Timeout after 60 seconds
timeout 60 dd if=/dev/urandom of=/mnt/1/x bs=1024 count=1024000000
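ionice composes naturally with the same pattern. A sketch (class 2, level 7 is the lowest best-effort I/O priority, and timeout exits with status 124 when it kills the command):
# Lowest best-effort I/O priority, killed after 60 seconds if still running
ionice -c2 -n7 timeout 60 \
    dd if=/dev/urandom of=/mnt/1/x bs=1024 count=1024000000
if [[ $? -eq 124 ]]; then
    echo "dd hit the 60s timeout" >&2
fi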
When automating system tasks, we often need to execute multiple commands in parallel while maintaining batch control. The requirement is to:
- Run N commands simultaneously (typically 3-5 in a batch)
- Wait for all commands in current batch to complete
- Proceed to the next batch
Here are three robust approaches; only Method 2 depends on an extra package (GNU parallel):
Method 1: Using Job Control
#!/bin/bash
batch_size=3
commands=(
"dd if=/dev/urandom of=/mnt/1/x bs=1024 count=1024000"
"dd if=/dev/urandom of=/mnt/2/x bs=1024 count=1024000"
"dd if=/dev/urandom of=/mnt/3/x bs=1024 count=1024000"
"dd if=/dev/urandom of=/mnt/4/x bs=1024 count=1024000"
"dd if=/dev/urandom of=/mnt/5/x bs=1024 count=1024000"
)
for ((i=0; i<${#commands[@]}; i+=batch_size)); do
# Launch one batch of up to batch_size commands in the background
for ((j=i; j<i+batch_size && j<${#commands[@]}; j++)); do
eval "${commands[j]}" &
done
# Block until every job in this batch has finished
wait
done
Method 2: Using GNU Parallel
#!/bin/bash
commands_file=$(mktemp)
cat > "$commands_file" <<'EOF'
dd if=/dev/urandom of=/mnt/1/x bs=1024 count=1024000
dd if=/dev/urandom of=/mnt/2/x bs=1024 count=1024000
dd if=/dev/urandom of=/mnt/3/x bs=1024 count=1024000
EOF
parallel -j 3 < "$commands_file"
rm -f "$commands_file"
For production systems, consider this enhanced version:
#!/bin/bash
set -o pipefail
run_batch() {
local commands=("$@")
local pids=()
for cmd in "${commands[@]}"; do
eval "$cmd" | tee -a "${cmd//[^[:alnum:]]/_}.log" & pids+=($!)
done
for pid in "${pids[@]}"; do
wait "$pid" || { echo "Command (pid $pid) failed" >&2; exit 1; }
done
}
commands_list=(
# Batch 1
"dd if=/dev/urandom of=/mnt/1/x bs=1024 count=1024000"
"dd if=/dev/urandom of=/mnt/2/x bs=1024 count=1024000"
# Batch 2
"dd if=/dev/urandom of=/mnt/3/x bs=1024 count=1024000"
"dd if=/dev/urandom of=/mnt/4/x bs=1024 count=1024000"
)
for ((i=0; i<${#commands_list[@]}; i+=2)); do
batch=("${commands_list[@]:i:2}")
run_batch "${batch[@]}"
done
When dealing with I/O-intensive operations like the dd examples:
- Monitor system load (use vmstat 1)
- Consider using ionice for disk operations
- Add a delay between batches, or stagger starts within a batch, if needed (see the sketch below)
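Staggering within a batch is just a sleep between launches. A sketch assuming a batch array like the ones above (the 2-second gap is arbitrary):
for cmd in "${batch[@]}"; do
    eval "$cmd" &
    sleep 2   # offset each start so the dd jobs don't hit the disk at once
done
wait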
Method 3: Using make -j
make's job server can drive each batch. The batch files produced by split contain raw command lines, which make cannot execute directly, so each one is first converted into a throwaway makefile whose targets run in parallel under -j:
.PHONY: all tasks
BATCH_SIZE := 3
tasks:
	@seq 1 10 | xargs -I{} echo "dd if=/dev/urandom of=/mnt/{}/x bs=1024 count=1024000" > tasks.list
all: tasks
	@split -l $(BATCH_SIZE) tasks.list batch.
	@for batch in batch.*; do \
		awk '{ printf "job%d:\n\t%s\n", NR, $$0; jobs = jobs " job" NR } \
			END { print "batch_all:" jobs }' $$batch > $$batch.mk; \
		$(MAKE) -j$(BATCH_SIZE) -f $$batch.mk batch_all || exit 1; \
		rm -f $$batch $$batch.mk; \
	done
	@rm -f tasks.list
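Usage is then simply the following (assuming the snippet is saved as Makefile; remember that recipe lines must be indented with tabs):
make all        # generate tasks.list and run it in batches of 3
make -n all     # dry run: print the recipes without executing them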