Optimizing Parallel cURL Requests in Bash for High-Performance API Calls

When dealing with multiple API endpoints or web services, executing requests sequentially can create significant performance bottlenecks. In modern bash scripting, parallel execution of cURL requests becomes essential when:

  • Making simultaneous API calls to different endpoints
  • Fetching large datasets that can be retrieved in parallel
  • Implementing health checks for distributed systems
  • Reducing total execution time for batch operations

The simplest approach uses the background operator (&) to run commands in parallel:

# Basic parallel execution
curl -s "https://api.example.com/data1" > output1.json &
curl -s "https://api.example.com/data2" > output2.json &
curl -s "https://api.example.com/data3" > output3.json &
curl -s "https://api.example.com/data4" > output4.json &
curl -s "https://api.example.com/data5" > output5.json &

# Wait for all background jobs to complete
wait
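
The ampersand alone gives no visibility into failures. If each job's PID is saved with $!, passing it to wait returns that job's exit status, so failures can be counted. A minimal sketch, using the same placeholder URLs as above (-f makes cURL exit non-zero on HTTP errors):

# Launch each request in the background and remember its PID
pids=()
for i in 1 2 3 4 5; do
  curl -sf "https://api.example.com/data${i}" -o "output${i}.json" &
  pids+=($!)
done

# wait <pid> returns that job's exit status
failed=0
for pid in "${pids[@]}"; do
  wait "$pid" || failed=$((failed + 1))
done
echo "Done: ${failed} request(s) failed" >&2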

For more sophisticated control, GNU parallel offers better job management:

# Install GNU parallel if needed
# sudo apt-get install parallel (Debian/Ubuntu)
# brew install parallel (macOS)

# Execute 5 parallel cURL requests; {#} is replaced by the job sequence number
printf "%s\n" https://api.example.com/data{1..5} | \
  parallel -j5 'curl -s {} > output{#}.json'

Proper error handling is crucial for production scripts:

# Parallel execution with error handling
urls=(
  "https://api.example.com/data1"
  "https://api.example.com/data2"
  "https://api.example.com/data3"
  "https://api.example.com/data4"
  "https://api.example.com/data5"
)

for url in "${urls[@]}"; do
  (
    # -w appends the HTTP status code to the response body; split the two afterwards
    response=$(curl -s -w "%{http_code}" "$url")
    http_code=${response: -3}
    content=${response%???}

    if [ "$http_code" -eq 200 ]; then
      echo "$content" > "output_${url##*/}.json"
    else
      echo "Error $http_code for $url" >&2
    fi
  ) &
done

wait
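
Much of this bookkeeping can also be pushed onto cURL itself: --fail turns HTTP errors into non-zero exit codes, and --retry re-attempts transient failures. A sketch combining those flags with the same subshell pattern (the retry counts and timeouts are arbitrary example values):

for url in "${urls[@]}"; do
  (
    # --fail: non-zero exit on HTTP errors; --retry: re-attempt transient failures
    if ! curl -sS --fail --retry 3 --retry-delay 1 \
          --connect-timeout 5 --max-time 30 \
          "$url" -o "output_${url##*/}.json"; then
      echo "Request failed after retries: $url" >&2
    fi
  ) &
done

wait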

When implementing parallel cURL requests:

  • Adjust the number of parallel jobs (-j) based on your system's capabilities
  • Implement rate limiting when dealing with third-party APIs
  • Consider adding small delays between batches to prevent throttling (see the sketch after this list)
  • Monitor memory usage when processing large numbers of requests
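
One simple way to approximate rate limiting without extra tooling is to launch requests in fixed-size batches and sleep between them. A rough sketch, assuming the urls array from the earlier example, a batch size of 5, and a 1-second pause (tune both to the API's documented limits):

batch_size=5
pause_seconds=1

for ((i = 0; i < ${#urls[@]}; i += batch_size)); do
  # Launch one batch of requests in the background
  for url in "${urls[@]:i:batch_size}"; do
    curl -s "$url" -o "output_${url##*/}.json" &
  done
  # Wait for the batch, then pause before starting the next one
  wait
  sleep "$pause_seconds"
done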

Another approach using xargs:

# Using xargs for parallel execution (-P sets the number of parallel processes);
# xargs has no {#} placeholder, so derive the output file name from the URL instead
printf "%s\n" https://api.example.com/data{1..5} | \
  xargs -P5 -I{} sh -c 'url="$1"; curl -s "$url" -o "output_${url##*/}.json"' _ {}

To recap the underlying problem: when multiple cURL commands run sequentially, the total execution time is the sum of all the individual request times, which quickly becomes unacceptable for performance-critical applications.

The simplest parallel execution method uses the ampersand (&) operator to run each command in the background:

#!/bin/bash

curl -s "https://api.example.com/endpoint1" & \
curl -s "https://api.example.com/endpoint2" & \
curl -s "https://api.example.com/endpoint3" & \
curl -s "https://api.example.com/endpoint4" & \
curl -s "https://api.example.com/endpoint5" & \
wait

The wait command ensures the script doesn't exit until all background processes complete.
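
Plain wait blocks until every job has finished. On bash 4.3 and newer, wait -n returns as soon as any one job exits, which makes it possible to cap the number of simultaneous requests without external tools. A rough sketch, assuming a limit of 3 concurrent jobs:

#!/bin/bash

max_jobs=3

for i in 1 2 3 4 5; do
  # If the cap is reached, block until one running job finishes (bash 4.3+)
  while (( $(jobs -rp | wc -l) >= max_jobs )); do
    wait -n
  done
  curl -s "https://api.example.com/endpoint${i}" -o "endpoint${i}.json" &
done

wait  # let the remaining jobs finish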

For more control and better resource management, GNU parallel offers robust features:

#!/bin/bash

seq 1 5 | parallel -j5 "curl -s 'https://api.example.com/endpoint{}'"

Key advantages include:

  • -j flag controls maximum parallel jobs
  • Automatic load balancing
  • Built-in retry mechanisms (see the sketch after this list)
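
For example, --retries re-runs a failed job a set number of times, and --joblog writes one line per job with its runtime and exit status, which makes it easy to audit which endpoints failed. A sketch (the retry count and log file name are arbitrary):

#!/bin/bash

# Retry each failing request up to 3 times and record per-job results
seq 1 5 | parallel -j5 --retries 3 --joblog parallel.log \
  "curl -sf 'https://api.example.com/endpoint{}' -o 'endpoint{}.json'"

# parallel.log lists start time, runtime and exit value for every job
column -t parallel.log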

When running parallel requests, output management becomes crucial:

#!/bin/bash

urls=(
  "https://api.example.com/endpoint1"
  "https://api.example.com/endpoint2"
  "https://api.example.com/endpoint3"
  "https://api.example.com/endpoint4"
  "https://api.example.com/endpoint5"
)

for url in "${urls[@]}"; do
  # -S surfaces error messages even with -s, so the .error files are not empty on failure
  (curl -sS "$url" > "${url##*/}.json" 2> "${url##*/}.error") &
done
wait

For optimal performance:

  • Limit parallel requests based on network bandwidth
  • Implement connection pooling when possible
  • Consider HTTP/2 for multiplexed requests (see the sketch after this list)
  • Add appropriate timeouts (--connect-timeout, --max-time)
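
On recent cURL versions (7.66.0 and later), much of this can be done inside a single cURL process: -Z/--parallel runs the transfers concurrently while reusing connections, and --http2 lets them be multiplexed over one connection where the server supports it. A sketch:

#!/bin/bash

# One cURL process, concurrent transfers, shared connections (cURL >= 7.66.0)
curl -sS --parallel --parallel-max 5 --http2 \
  --connect-timeout 5 --max-time 30 \
  "https://api.example.com/endpoint1" -o endpoint1.json \
  "https://api.example.com/endpoint2" -o endpoint2.json \
  "https://api.example.com/endpoint3" -o endpoint3.json \
  "https://api.example.com/endpoint4" -o endpoint4.json \
  "https://api.example.com/endpoint5" -o endpoint5.json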

While cURL is ubiquitous, other tools might better suit specific scenarios:

Tool     Pros                                  Cons
aria2c   Resume support, segmented downloads   Download-oriented; less suited to API-style requests
wget2    Built-in parallel support             Less flexible than cURL
HTTPie   Simpler syntax                        Fewer advanced features
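
For download-heavy workloads, aria2c illustrates the trade-off: it reads a URL list and manages parallelism itself, but offers little of cURL's request-level control. A minimal sketch, assuming the data{1..5} placeholder URLs and that aria2c is installed:

# aria2c reads URLs from a file; -j limits concurrent downloads, -d sets the output directory
printf "%s\n" https://api.example.com/data{1..5} > urls.txt
aria2c -i urls.txt -j 5 -d downloads/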