When dealing with multiple API endpoints or web services, executing requests sequentially can create significant performance bottlenecks. In modern bash scripting, parallel execution of cURL requests becomes essential when:
- Making simultaneous API calls to different endpoints
- Fetching large datasets that can be retrieved in parallel
- Implementing health checks for distributed systems
- Reducing total execution time for batch operations
The simplest approach uses the background operator (&) to run commands in parallel:
# Basic parallel execution
curl -s "https://api.example.com/data1" > output1.json &
curl -s "https://api.example.com/data2" > output2.json &
curl -s "https://api.example.com/data3" > output3.json &
curl -s "https://api.example.com/data4" > output4.json &
curl -s "https://api.example.com/data5" > output5.json &
# Wait for all background jobs to complete
wait
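As an illustration of the health-check use case listed above, the same background-operator pattern can collect each endpoint's HTTP status code. This is a minimal sketch; the service names and the /health path are placeholders:
# Hypothetical health check: print the HTTP status of each service in parallel
for service in auth billing search; do
    (
        code=$(curl -s -o /dev/null -w "%{http_code}" \
            --connect-timeout 5 "https://api.example.com/health/$service")
        echo "$service: $code"
    ) &
done
wait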
For more sophisticated control, GNU parallel offers better job management:
# Install GNU parallel if needed
# sudo apt-get install parallel (Debian/Ubuntu)
# brew install parallel (macOS)
# Execute 5 parallel cURL requests
printf "%s\n" https://api.example.com/data{1..5} | \
parallel -j5 curl -s {} ">" output{#}.json
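Quoting the command string keeps the redirection from being interpreted by the calling shell; GNU parallel replaces {} with each URL and {#} with the job sequence number, so the responses land in output1.json through output5.json.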
Proper error handling is crucial for production scripts:
# Parallel execution with error handling
urls=(
    "https://api.example.com/data1"
    "https://api.example.com/data2"
    "https://api.example.com/data3"
    "https://api.example.com/data4"
    "https://api.example.com/data5"
)

for url in "${urls[@]}"; do
    (
        # Append the HTTP status code to the body, then split the two apart
        response=$(curl -s -w "%{http_code}" "$url")
        http_code=${response: -3}
        content=${response%???}
        if [ "$http_code" -eq 200 ]; then
            echo "$content" > "output_${url##*/}.json"
        else
            echo "Error $http_code for $url" >&2
        fi
    ) &
done
wait
When implementing parallel cURL requests:
- Adjust the number of parallel jobs (-j) based on your system's capabilities
- Implement rate limiting when dealing with third-party APIs
- Consider adding small delays between batches to prevent throttling (see the batching sketch after this list)
- Monitor memory usage when processing large numbers of requests
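One way to honor those last two points is to launch requests in fixed-size batches and pause briefly between them. This is a sketch only; the batch size and delay are arbitrary placeholders:
# Hypothetical batching: run at most 2 requests at a time, pause between batches
batch_size=2
delay=1   # seconds to pause between batches
count=0

for url in https://api.example.com/data{1..5}; do
    curl -s "$url" -o "output_${url##*/}.json" &
    count=$((count + 1))
    if (( count % batch_size == 0 )); then
        wait              # let the current batch finish
        sleep "$delay"    # back off before starting the next batch
    fi
done
wait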
Another approach using xargs:
# Using xargs for parallel execution (-P sets the number of parallel processes)
# xargs has no {#} job counter, so the output name is derived from the URL instead
printf "%s\n" https://api.example.com/data{1..5} | \
    xargs -P5 -I{} sh -c 'curl -s "$1" -o "output_${1##*/}.json"' _ {}
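In the inner shell, the trailing _ fills $0 and the URL arrives as $1, which avoids splicing the URL directly into the command string.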
To recap the core problem: when multiple cURL commands run sequentially, the total execution time is the sum of all individual request times, which quickly becomes unacceptable for performance-critical applications.
The simplest parallel execution method uses the ampersand (&) operator to run commands in the background:
#!/bin/bash
curl -s "https://api.example.com/endpoint1" & \
curl -s "https://api.example.com/endpoint2" & \
curl -s "https://api.example.com/endpoint3" & \
curl -s "https://api.example.com/endpoint4" & \
curl -s "https://api.example.com/endpoint5" & \
wait
The wait command ensures the script doesn't exit until all background processes complete.
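If you also need to know whether each request succeeded, one option is to record the PIDs and wait on them individually. This is a sketch, using curl's -f flag so HTTP errors produce a non-zero exit status:
#!/bin/bash
pids=()
for i in 1 2 3 4 5; do
    curl -sf "https://api.example.com/endpoint$i" -o "endpoint$i.json" &
    pids+=($!)            # remember each job's PID
done

failed=0
for pid in "${pids[@]}"; do
    wait "$pid" || failed=$((failed + 1))   # wait returns that job's exit status
done
echo "$failed request(s) failed" >&2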
For more control and better resource management, GNU parallel offers robust features:
#!/bin/bash
seq 1 5 | parallel -j5 "curl -s 'https://api.example.com/endpoint{}'"
Key advantages include:
- The -j flag controls the maximum number of parallel jobs
- Automatic load balancing across jobs
- Built-in retry mechanisms (see the sketch below)
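As an example of the retry support, the one-liner above can be extended with GNU parallel's --retries and --joblog options; curl's -f flag makes HTTP errors count as failures so that retries actually trigger:
#!/bin/bash
# Retry each failed request up to 3 times and record outcomes in parallel.log
seq 1 5 | parallel -j5 --retries 3 --joblog parallel.log \
    "curl -sf 'https://api.example.com/endpoint{}' -o 'endpoint{}.json'"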
When running parallel requests, output management becomes crucial:
#!/bin/bash
urls=(
    "https://api.example.com/endpoint1"
    "https://api.example.com/endpoint2"
    "https://api.example.com/endpoint3"
    "https://api.example.com/endpoint4"
    "https://api.example.com/endpoint5"
)

for url in "${urls[@]}"; do
    # -sS stays quiet on success but still prints error messages to stderr,
    # so the .error files are not left empty when a request fails
    (curl -sS "$url" > "${url##*/}.json" 2> "${url##*/}.error") &
done
wait
For optimal performance:
- Limit parallel requests based on available network bandwidth
- Reuse connections where possible (a single curl invocation given multiple URLs reuses them; separate curl processes cannot)
- Consider HTTP/2 (--http2) for multiplexed requests
- Add appropriate timeouts (--connect-timeout, --max-time), as in the sketch below
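A sketch combining those timeout flags with the earlier xargs pattern; the specific timeout and retry values are arbitrary examples:
#!/bin/bash
# Each request gives up after 5s connecting or 30s total, retrying twice on transient failures
printf "%s\n" https://api.example.com/endpoint{1..5} | \
    xargs -P5 -I{} sh -c \
        'curl -sS --connect-timeout 5 --max-time 30 --retry 2 "$1" -o "${1##*/}.json"' _ {}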
While cURL is ubiquitous, other tools might better suit specific scenarios:
| Tool   | Pros                                 | Cons                                         |
|--------|--------------------------------------|----------------------------------------------|
| aria2c | Resume support, segmented downloads  | Download-oriented, not a general HTTP client |
| wget2  | Built-in parallel support            | Less flexible than cURL                      |
| HTTPie | Simpler syntax                       | Fewer advanced features                      |
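For instance, aria2c can read a URL list and fetch several files at once; a minimal sketch assuming a urls.txt file with one URL per line:
# Download the URLs in urls.txt, at most five at a time
aria2c -j5 -i urls.txt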