When working with large batch jobs in Unix/Linux systems, many developers encounter this frustrating behavior:
find . -name "*.log" | xargs -n 50 process_logs.sh
# Stops immediately if any invocation exits with status 255
This happens because xargs treats exit code 255 as a special case that triggers immediate termination, as documented in the man pages. For batch processing, this default behavior is often undesirable.
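You can reproduce the cutoff with nothing but standard tools (seq and sh here stand in for the real job): every batch below exits 255, so xargs runs only the first one and abandons the remaining input.
# Five two-number batches are queued, but only the first ever runs
seq 1 10 | xargs -n 2 sh -c 'echo "batch: $*"; exit 255' _
# xargs prints an error on stderr and stops after the first invocation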
Here are several effective approaches to handle this:
# Solution 1: GNU xargs options plus an inline shield so no invocation ever exits 255
find . -name "*.log" | xargs -n 50 --no-run-if-empty --max-procs=4 \
    sh -c 'process_logs.sh "$@" || exit 1' _
# Solution 2: Wrapper script that handles exit codes
#!/bin/bash
process_logs.sh "$@" || exit 1 # Never return 255
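Assuming the two-line wrapper above is saved as, say, safe_process.sh (the name is arbitrary) and made executable, the original pipeline only needs the command swapped out:
chmod +x safe_process.sh
find . -name "*.log" | xargs -n 50 ./safe_process.sh
# A failing batch now surfaces as exit 1; xargs keeps reading input and
# exits 123 at the end if any invocation failed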
For your 1500-line batch job with 50-line chunks and logging:
#!/bin/bash
# batch_processor.sh
LOG_FILE="processing_$(date +%Y%m%d).log"
JOB_LIST="jobs.txt"
# Process with error continuation
cat "$JOB_LIST" | xargs -n 50 -P 4 -I{} bash -c '
echo "Processing batch: {}" >> "$0"
execute_job.sh {} >> "$0" 2>&1 || echo "Failed: {}" >> "$0"
' "$LOG_FILE"
# Post-processing analysis
grep -c "Failed:" "$LOG_FILE" || echo "All batches completed successfully"
For mission-critical systems, consider these patterns:
# Pattern 1: Retry mechanism
retry_command() {
    local n=3 i
    for ((i=1; i<=n; i++)); do
        "$@" && return 0
        sleep $((i*2))   # back off a little longer after each failure
    done
    return 1             # 1, not 255, so xargs keeps going
}
export -f retry_command
find . -name "*.tmp" | xargs -P 8 -I{} bash -c 'retry_command process_file "$1"' _ {}
# Pattern 2: Error continuation with progress tracking
# (re-running with the same --joblog retries failed jobs and resumes unfinished ones)
parallel --joblog joblog.csv --resume-failed --progress -N50 process_batch.sh ::: $(seq 1 1500)
When processing large batch jobs with xargs, many developers encounter this frustrating behavior: if any invocation of the command exits with status 255, xargs immediately stops and ignores the rest of the input. According to the manual:
$ man xargs
[...]
If any invocation of the command exits with a status of 255,
xargs will stop immediately without reading any further input.
An error message is issued on stderr when this happens.
[...]
Here are three battle-tested approaches to handle this in real-world scenarios:
Solution 1: Using --no-run-if-empty with Error Handling
# Process 50 items at a time, continue on errors; the trailing "_" fills $0,
# so the batch lands in $1..$n and the inline loop itself never exits 255
cat joblist.txt | xargs -n 50 -P 4 --no-run-if-empty \
    sh -c 'for arg; do your_command "$arg" || continue; done' _
Solution 2: Wrapper Script Approach
Create a wrapper script (process_wrapper.sh):
#!/bin/bash
set -euo pipefail
status=0
for item in "$@"; do
    if ! process_item "$item"; then
        echo "Error processing $item" >&2
        status=1   # remember the failure but keep processing the batch
    fi
done
# Return a status other than 255 so xargs keeps reading input
exit "$status"
Make the wrapper executable (chmod +x process_wrapper.sh), then execute with:
cat joblist.txt | xargs -n 50 ./process_wrapper.sh
Combining with GNU Parallel
# More robust alternative with better error handling
parallel --jobs 4 --halt never --joblog job.log \
    your_command :::: joblist.txt
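The joblog doubles as an audit trail. A minimal sketch for listing the failures afterwards, assuming the default joblog layout where Exitval is the seventh tab-separated field:
# Print joblog rows with a non-zero exit status (skip the header line)
awk -F'\t' 'NR > 1 && $7 != 0' job.log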
Logging and Monitoring Implementation
# Full production-ready example with logging
timestamp=$(date +%Y%m%d_%H%M%S)
logfile="batch_${timestamp}.log"
{
cat joblist.txt | \
xargs -n 50 -P 4 sh -c '
    for item in "$@"; do
        if ! your_command "$item"; then
            echo "FAILED: $item" >&2
            continue
        fi
        echo "PROCESSED: $item"
    done
' _
} > "$logfile" 2>&1
In continuous integration systems, partial batch job failures can cause deployment bottlenecks. The solutions above ensure:
- Complete processing of all items
- Proper error isolation
- Comprehensive logging
- Resource efficiency (parallel processing)
For mission-critical systems, consider adding database tracking of processed items to enable restart capability.
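A minimal sketch of that idea using the sqlite3 command-line tool, run under bash (since export -f is used); progress.db, the done table, and your_command are placeholder names, and items are assumed to contain no single quotes:
# Record finished items so an interrupted run can be restarted safely
DB="progress.db"
sqlite3 "$DB" "CREATE TABLE IF NOT EXISTS done (item TEXT PRIMARY KEY);"

process_tracked() {
    local item="$1"
    # Skip anything a previous run already completed
    [ -n "$(sqlite3 "$DB" "SELECT 1 FROM done WHERE item = '$item';")" ] && return 0
    your_command "$item" || return 1   # return 1, never 255
    sqlite3 "$DB" "INSERT OR IGNORE INTO done (item) VALUES ('$item');"
}
export -f process_tracked
export DB

cat joblist.txt | xargs -P 4 -I{} bash -c 'process_tracked "$1"' _ {}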