How to Properly Run ssh-agent in a Shell Script and Preserve Command Flow


When you try to execute ssh-agent $SHELL in a shell script, you encounter an immediate behavioral quirk: the command spawns a new shell instance and halts further script execution. This happens because:

  • ssh-agent starts the new shell as a child process
  • The calling script blocks until that child shell exits
  • Commands after it (like ssh-add) only run once you leave the new shell, by which point the agent it started has already terminated

The proper way involves capturing ssh-agent's environment variables:

#!/bin/bash
eval "$(ssh-agent -s)"
ssh-add /path/to/private_key
# Rest of your script continues normally
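
For context, ssh-agent -s prints Bourne-shell commands on stdout, and eval runs them in the current shell so the variables survive for the rest of the script. The output typically looks like this (socket path and PIDs will differ on your system):

SSH_AUTH_SOCK=/tmp/ssh-XXXXXXabcdef/agent.12345; export SSH_AUTH_SOCK;
SSH_AGENT_PID=12346; export SSH_AGENT_PID;
echo Agent pid 12346;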

Method 1: Using ssh-agent with exec

#!/bin/bash
exec ssh-agent bash -c 'ssh-add /path/to/key; your_commands_here'
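
With exec, the script replaces itself with ssh-agent, which runs the quoted command list and shuts down automatically when it finishes, so nothing needs to follow that line. A filled-in version might look like this (the key path, host, and remote command are placeholders):

#!/bin/bash
# The agent only lives for the duration of the quoted commands
exec ssh-agent bash -c 'ssh-add ~/.ssh/deploy_key && ssh deploy@example.com "uptime"'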

Method 2: Persistent Agent Management

#!/bin/bash
if [ -z "$SSH_AUTH_SOCK" ]; then
   eval "$(ssh-agent -s)" > /dev/null
   trap "ssh-agent -k" EXIT
fi
ssh-add ~/.ssh/id_rsa
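
A common extension of this pattern is to cache the agent environment in a file so repeated runs of the script reuse one agent instead of starting a new one each time. This is a sketch under the assumption that you want the agent to outlive the script (so the EXIT trap is dropped); the cache path is arbitrary:

#!/bin/bash
AGENT_ENV="$HOME/.ssh/agent.env"

# Reuse an agent started by a previous run, if its environment file still works
if [ -z "$SSH_AUTH_SOCK" ] && [ -f "$AGENT_ENV" ]; then
    . "$AGENT_ENV" > /dev/null
fi

# ssh-add -l exits with status 2 when no agent can be contacted
ssh-add -l > /dev/null 2>&1
if [ $? -eq 2 ]; then
    (umask 077; ssh-agent -s > "$AGENT_ENV")
    . "$AGENT_ENV" > /dev/null
fi

ssh-add ~/.ssh/id_rsa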

Here's a robust solution with error handling:

#!/bin/bash
# Start ssh-agent if this script has no reachable agent
# (checking $SSH_AUTH_SOCK is more useful here than pgrep: an agent running
#  elsewhere for this user does the script no good unless its socket is exported)
if [ -z "$SSH_AUTH_SOCK" ]; then
    eval "$(ssh-agent -s)" > /dev/null
fi

# Add key with timeout
if ! ssh-add -l | grep -q "your_key_fingerprint"; then
    ssh-add -t 3600 ~/.ssh/id_rsa || {
        echo "Failed to add SSH key" >&2
        exit 1
    }
fi

# Proceed with SSH operations
ssh user@host "your_remote_command"

Best practices (a combined sketch follows this list):

  • Don't hardcode key paths; take them from variables or the environment
  • Set a lifetime (ssh-add -t) for added keys
  • Consider ssh-add -c to require per-use confirmation (this needs an askpass helper configured)
  • Clean up agent processes with a trap on script exit
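
A minimal sketch tying those points together (the SSH_KEY variable, its default path, and the 30-minute lifetime are illustrative, not required values):

#!/bin/bash
SSH_KEY="${SSH_KEY:-$HOME/.ssh/deploy_key}"   # override via the environment rather than hardcoding

eval "$(ssh-agent -s)" > /dev/null
trap 'ssh-agent -k > /dev/null' EXIT          # kill the agent when the script exits

# Limit the key's lifetime and ask for confirmation on each use
ssh-add -t 1800 -c "$SSH_KEY"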

To recap the underlying problem: when automating SSH operations in shell scripts, many developers first hit this frustrating behavior:

#!/bin/bash
ssh-agent $SHELL  # Spawns a new shell and blocks here
ssh-add ~/.ssh/id_ed25519  # Only runs after you exit that shell, when the agent is gone
echo "By the time this runs, the agent is already gone"

The fundamental issue stems from how ssh-agent works:

  1. Given $SHELL as its argument, ssh-agent launches a new interactive shell as a child process
  2. The parent script blocks, waiting for that child shell to exit
  3. SSH_AUTH_SOCK and SSH_AGENT_PID are set only inside the child shell, and the agent itself terminates when that shell exits
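
One way to convince yourself of this (an illustrative snippet, not part of the fix):

#!/bin/bash
ssh-agent $SHELL                 # type `echo $SSH_AUTH_SOCK` in the new shell: it prints a socket path
echo "${SSH_AUTH_SOCK:-empty}"   # back in the script after exiting: prints "empty", the variable never reached us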

The solution involves capturing and exporting the environment variables:

#!/bin/bash

# Start ssh-agent and capture output
eval "$(ssh-agent -s)"

# Now add your key
ssh-add ~/.ssh/id_rsa_deploy

# Verify the agent has your key
ssh-add -l

# Continue with other operations
git pull origin main
rsync -avz ./ user@remote:/path/

For POSIX-compliant scripts, use this method:

#!/bin/sh

# Start the agent and capture its Bourne-shell setup commands in a temp file
# (use -s here; -c emits C-shell syntax that /bin/sh cannot source)
env_file=$(mktemp) || exit 1
ssh-agent -s | grep -v '^echo' > "$env_file"
. "$env_file"

ssh-add "$HOME/.ssh/cicd_key"
rm -f "$env_file"
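
If the script should also shut the agent down when it finishes, ssh-agent -k kills the agent identified by SSH_AGENT_PID and prints matching unset commands, so the same eval trick applies:

eval "$(ssh-agent -k)" > /dev/null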

For production environments with multiple keys:

#!/bin/bash
set -e  # Exit on error

# Initialize agent
eval "$(ssh-agent -s)"

# Add keys with timeout (2 hours)
ssh-add -t 7200 ~/.ssh/id_rsa_web ~/.ssh/id_rsa_db

# Alternative: if gpg-agent serves as your SSH agent, point SSH_AUTH_SOCK at its
# socket instead of starting ssh-agent above:
#   export SSH_AUTH_SOCK="$(gpgconf --list-dirs agent-ssh-socket)"

If things aren't working, try these diagnostic steps:

# Check if agent is running
if [ -z "$SSH_AUTH_SOCK" ]; then
  echo "SSH agent not running"
  exit 1
fi

# Verify key was added
if ! ssh-add -l | grep -q "SHA256:"; then
  echo "No keys loaded in agent"
  exit 1
fi
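
ssh-add -l also distinguishes the two failure modes by exit status (in OpenSSH: 1 means the agent is reachable but holds no keys, 2 means no agent could be contacted), which makes the checks above more precise:

ssh-add -l > /dev/null 2>&1
case $? in
  0) echo "Agent running with keys loaded" ;;
  1) echo "Agent running but no keys loaded" >&2 ;;
  2) echo "Cannot connect to ssh-agent (is SSH_AUTH_SOCK set?)" >&2; exit 1 ;;
esac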

For enterprise scripts, consider this robust pattern:

#!/bin/bash

# Function to ensure agent is available
init_ssh_agent() {
  if [ -z "$SSH_AUTH_SOCK" ]; then
    eval "$(ssh-agent -s)" > /dev/null
    trap 'ssh-agent -k' EXIT  # Clean up on exit
  fi
}

# Main execution
init_ssh_agent
SSH_KEY_FILE="${HOME}/.ssh/$(hostname -s)_key"

if [ -f "$SSH_KEY_FILE" ]; then
  ssh-add -q "$SSH_KEY_FILE" || {
    echo "Failed to add key" >&2
    exit 1
  }
fi

# Rest of your automation...
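
For unattended runs it can also help to make the SSH calls themselves non-interactive, so a missing or rejected key fails fast instead of hanging on a prompt. These are standard OpenSSH client options; the host and command are placeholders:

ssh -o BatchMode=yes -o ConnectTimeout=10 deploy@example.com "uptime"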