Optimizing Parallel cURL Requests in Bash for High-Performance API Calls


When dealing with multiple API endpoints or web services, executing requests sequentially can create significant performance bottlenecks. In modern bash scripting, parallel execution of cURL requests becomes essential when:

  • Making simultaneous API calls to different endpoints
  • Fetching large datasets that can be retrieved in parallel
  • Implementing health checks for distributed systems
  • Reducing total execution time for batch operations

The simplest approach uses the background operator (&) to run commands in parallel:

# Basic parallel execution
curl -s "https://api.example.com/data1" > output1.json &
curl -s "https://api.example.com/data2" > output2.json &
curl -s "https://api.example.com/data3" > output3.json &
curl -s "https://api.example.com/data4" > output4.json &
curl -s "https://api.example.com/data5" > output5.json &

# Wait for all background jobs to complete
wait

For more sophisticated control, GNU parallel offers better job management:

# Install GNU parallel if needed
# sudo apt-get install parallel (Debian/Ubuntu)
# brew install parallel (macOS)

# Execute 5 parallel cURL requests; {#} is GNU parallel's job sequence number
printf "%s\n" https://api.example.com/data{1..5} | \
parallel -j5 'curl -s {} > output{#}.json'

Proper error handling is crucial for production scripts:

# Parallel execution with error handling
urls=(
  "https://api.example.com/data1"
  "https://api.example.com/data2"
  "https://api.example.com/data3"
  "https://api.example.com/data4"
  "https://api.example.com/data5"
)

for url in "${urls[@]}"; do
  (
    # -w "%{http_code}" appends the status code to the response body
    response=$(curl -s -w "%{http_code}" "$url")
    http_code=${response: -3}   # last three characters are the HTTP status
    content=${response%???}     # body with the trailing status code removed
    
    if [ "$http_code" -eq 200 ]; then
      echo "$content" > "output_${url##*/}.json"
    else
      echo "Error $http_code for $url" >&2
    fi
  ) &
done

wait

When implementing parallel cURL requests:

  • Adjust the number of parallel jobs (-j) based on your system's capabilities
  • Implement rate limiting when dealing with third-party APIs
  • Consider adding small delays between batches to prevent throttling (see the batching sketch after this list)
  • Monitor memory usage when processing large numbers of requests
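
As a sketch of the batching idea above, the loop below launches at most batch_size requests at a time, waits for the batch to complete, then pauses briefly before starting the next one (the URLs, batch size, and delay are illustrative placeholders):

# Batch-based rate limiting: cap concurrency and pause between batches
batch_size=5       # maximum concurrent requests
batch_delay=1      # seconds to pause between batches

urls=(https://api.example.com/data{1..20})

count=0
for url in "${urls[@]}"; do
  curl -s "$url" -o "output_${url##*/}.json" &
  if (( ++count % batch_size == 0 )); then
    wait                  # let the current batch finish
    sleep "$batch_delay"  # brief pause to avoid throttling
  fi
done
wait  # catch any requests in the final, partial batch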

Another approach using xargs:

# Using xargs for parallel execution (-P sets the number of parallel processes)
# xargs has no job-number placeholder, so derive the output name from the URL
printf "%s\n" https://api.example.com/data{1..5} | \
xargs -P5 -I{} sh -c 'url=$1; curl -s "$url" -o "output_${url##*/}.json"' _ {}

Executing multiple cURL commands sequentially in bash creates significant latency: the total execution time is the sum of all individual request times, which is unacceptable for performance-critical applications.
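
As a rough illustration of that cost, assuming five endpoints that each take about one second to respond, the sequential loop below needs roughly five seconds of wall-clock time:

#!/bin/bash

# Sequential baseline: each request must finish before the next one starts,
# so total time is approximately the sum of the individual response times.
time for i in {1..5}; do
  curl -s "https://api.example.com/endpoint$i" -o "sequential_$i.json"
done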

The simplest parallel execution method uses the ampersand (&) operator to run commands in background:

#!/bin/bash

curl -s "https://api.example.com/endpoint1" & \
curl -s "https://api.example.com/endpoint2" & \
curl -s "https://api.example.com/endpoint3" & \
curl -s "https://api.example.com/endpoint4" & \
curl -s "https://api.example.com/endpoint5" & \
wait

The wait command ensures the script doesn't exit until all background processes complete.
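When per-request status matters, wait can also be given individual process IDs so the script can report which requests failed. A minimal sketch along those lines (the output file names are illustrative):

#!/bin/bash

# Remember each request's PID so per-request exit codes can be collected.
declare -A pids

for i in {1..5}; do
  # -f makes curl return a non-zero exit code on HTTP errors
  curl -sf "https://api.example.com/endpoint$i" -o "endpoint$i.json" &
  pids[$i]=$!
done

failed=0
for i in "${!pids[@]}"; do
  if ! wait "${pids[$i]}"; then
    echo "Request $i failed" >&2
    (( failed++ ))
  fi
done
echo "$failed request(s) failed"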

For more control and better resource management, GNU parallel offers robust features:

#!/bin/bash

seq 1 5 | parallel -j5 "curl -s 'https://api.example.com/endpoint{}'"

Key advantages include:

  • -j flag controls maximum parallel jobs
  • Automatic load balancing
  • Built-in retry mechanisms (see the example after this list)
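
For instance, GNU parallel's --retries and --joblog options add retry-on-failure behavior and a per-job log without any extra scripting (the output file names below are illustrative):

#!/bin/bash

# Retry each failing request up to 3 times and log timing/exit codes to jobs.log
seq 1 5 | parallel -j5 --retries 3 --joblog jobs.log \
  "curl -sf 'https://api.example.com/endpoint{}' -o 'endpoint{}.json'"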

When running parallel requests, output management becomes crucial:

#!/bin/bash

urls=(
  "https://api.example.com/endpoint1"
  "https://api.example.com/endpoint2"
  "https://api.example.com/endpoint3"
  "https://api.example.com/endpoint4"
  "https://api.example.com/endpoint5"
)

for url in "${urls[@]}"; do
  (curl -s "$url" > "${url##*/}.json" 2> "${url##*/}.error") &
done
wait

For optimal performance:

  • Limit parallel requests based on network bandwidth
  • Implement connection pooling when possible
  • Consider HTTP/2 for multiplexed requests
  • Add appropriate timeouts (--connect-timeout, --max-time), as shown in the sketch after this list
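
A sketch combining those timeout flags with curl's --retry option (the output file names are illustrative):

#!/bin/bash

# Bound each request: 5 s to establish the connection, 30 s total,
# and up to 2 retries on transient failures.
for i in {1..5}; do
  curl -s --connect-timeout 5 --max-time 30 --retry 2 \
       "https://api.example.com/endpoint$i" -o "endpoint$i.json" &
done
wait

Recent cURL releases (7.66 and later) also provide a native --parallel (-Z) mode which, combined with --http2, can multiplex many requests over a single connection.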

While cURL is ubiquitous, other tools might better suit specific scenarios:

Tool     Pros                                  Cons
aria2c   Resume support, segmented downloads   HTTP-focused
wget2    Built-in parallel support             Less flexible than cURL
HTTPie   Simpler syntax                        Fewer advanced features
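
For example, aria2c can read a URL list from a file and fetch the entries concurrently (a sketch; urls.txt and the option values are illustrative):

# urls.txt contains one URL per line
aria2c --max-concurrent-downloads=5 --input-file=urls.txt --dir=downloads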