Automating SSH Known Hosts for GitHub in Bash Scripts: A Secure and Idempotent Approach


When provisioning new servers, manually adding GitHub as a known host via ssh git@github.com becomes tedious and breaks automation workflows. The standard approach requires interactive confirmation, which isn't suitable for scripted environments.

The ssh-keyscan utility fetches a host's public keys non-interactively, which makes it a good fit for scripts:

# Basic implementation
ssh-keyscan github.com >> ~/.ssh/known_hosts

To let the script run multiple times without accumulating duplicate entries, remove any existing GitHub entries before appending fresh ones:

#!/bin/bash

KNOWN_HOSTS_FILE="$HOME/.ssh/known_hosts"
GITHUB_DOMAIN="github.com"

# Create .ssh directory if it doesn't exist
mkdir -p ~/.ssh
chmod 700 ~/.ssh

# Remove existing GitHub entries if present
ssh-keygen -R "$GITHUB_DOMAIN" -f "$KNOWN_HOSTS_FILE" >/dev/null 2>&1

# Add fresh GitHub keys
ssh-keyscan -H "$GITHUB_DOMAIN" >> "$KNOWN_HOSTS_FILE" 2>/dev/null

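# Verify the keys were added. A plain grep for the hostname would miss entries
# hashed by -H, so use ssh-keygen -F, which also matches hashed hostnames.
if ssh-keygen -F "$GITHUB_DOMAIN" -f "$KNOWN_HOSTS_FILE" >/dev/null 2>&1; then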
    echo "✓ GitHub SSH keys successfully added"
else
    echo "✗ Failed to add GitHub SSH keys" >&2
    exit 1
fi

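To cover additional GitHub endpoints, loop over a list of hosts. Note that ssh.github.com is GitHub's SSH-over-port-443 endpoint; connections to it are looked up as [ssh.github.com]:443, so scan it with ssh-keyscan -p 443 if you connect that way: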

GITHUB_DOMAINS=(
    "github.com"
    "ssh.github.com"
    "gist.github.com"
)

for domain in "${GITHUB_DOMAINS[@]}"; do
    ssh-keygen -R "$domain" -f "$KNOWN_HOSTS_FILE" >/dev/null 2>&1
    ssh-keyscan -H "$domain" >> "$KNOWN_HOSTS_FILE" 2>/dev/null
done

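The above approach has the following properties:

  • ssh-keyscan only retrieves public keys
  • The -H flag hashes hostnames in known_hosts
  • Existing entries are removed before new ones are added, so reruns do not accumulate duplicates
  • The single-host script exits with an error if no GitHub entry is found after the scan

Note that ssh-keyscan does not authenticate what it receives: it records whatever key the answering host presents. Compare the scanned keys against the fingerprints GitHub publishes before relying on them.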
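The multi-domain loop discards ssh-keyscan's error output and never checks the result, so a failed scan there would pass silently. One way to make the failure visible is sketched here, assuming the scan output is first written to a temporary file. The helper name append_scanned_keys is hypothetical; it refuses to touch known_hosts when the staged file contains no key lines:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append staged ssh-keyscan output to a known_hosts file,
# but only if the staged file contains at least one key line.
append_scanned_keys() {
    local staged_file="$1" known_hosts_file="$2"
    local key_lines

    # Count lines that are neither blank nor comments
    key_lines=$(grep -cvE '^[[:space:]]*(#|$)' "$staged_file" 2>/dev/null || true)
    if [[ "${key_lines:-0}" -eq 0 ]]; then
        echo "No host keys in $staged_file; leaving $known_hosts_file untouched" >&2
        return 1
    fi

    mkdir -p "$(dirname "$known_hosts_file")"
    # Append only the key lines, dropping comments and blank lines
    grep -vE '^[[:space:]]*(#|$)' "$staged_file" >> "$known_hosts_file"
}
```

A caller would stage the scan with ssh-keyscan -H github.com > "$staged" 2>/dev/null and then run append_scanned_keys "$staged" "$HOME/.ssh/known_hosts" || exit 1, so an empty scan stops the provisioning run instead of being ignored.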

This can be incorporated into larger provisioning workflows, for example as a shell step run by Ansible, Terraform, or cloud-init:

#!/bin/bash

# Ensure all dependencies are installed (Debian/Ubuntu; requires root)
if ! command -v ssh-keyscan &> /dev/null; then
    apt-get update && apt-get install -y openssh-client
fi

# Add GitHub as known host
add_github_to_known_hosts() {
    # ... [previous implementation] ...
    :   # no-op so the function parses until the implementation is pasted in
}

add_github_to_known_hosts

# Continue with other provisioning tasks

Host key verification is a common hurdle when scripting server provisioning, because running ssh git@github.com and accepting the key by hand doesn't scale to automated environments. Here's how to handle it in Bash without rewriting entries that already exist.

The known_hosts file lets the SSH client detect an unexpected host key, which is its main defense against man-in-the-middle attacks. When automating:

  • We need to add GitHub's host key without manual intervention
  • The solution must be idempotent (safe to run multiple times)
  • It should work across different GitHub domains (github.com, gist.github.com, etc.)

Here's a complete Bash function that skips hosts that are already known and reports failures:

# Add GitHub to known hosts (idempotent operation)
add_github_to_known_hosts() {
    local known_hosts_file="$HOME/.ssh/known_hosts"
    local github_domains=("github.com" "gist.github.com")
    local domain

    for domain in "${github_domains[@]}"; do
        # -F exits non-zero when the host is absent (or the file does not exist yet)
        if ! ssh-keygen -F "$domain" -f "$known_hosts_file" >/dev/null 2>&1; then
            echo "Adding $domain to known hosts"
            ssh-keyscan -t rsa,ecdsa,ed25519 "$domain" >> "$known_hosts_file" 2>/dev/null
            # Verify that an entry for exactly this host is now present
            # (a grep for "github.com" would also match gist.github.com)
            ssh-keygen -F "$domain" -f "$known_hosts_file" >/dev/null 2>&1 || {
                echo "Failed to add $domain keys" >&2
                return 1
            }
        else
            echo "$domain already in known hosts"
        fi
    done
}

The solution works by:

  1. Checking if the domain already exists in known_hosts (using ssh-keygen -F)
  2. Only adding it if not present (idempotent operation)
  3. Scanning for multiple key types (RSA, ECDSA, Ed25519)
  4. Verifying the keys were actually added
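To confirm that step 3 actually recorded several key types, the file can be inspected afterwards. The following is a minimal sketch with a hypothetical helper name, key_types_for_host. It assumes unhashed entries (no -H), where the first field of each line is a comma-separated host list:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the key types stored for a host in a known_hosts file.
# Assumes unhashed entries, where field 1 is a comma-separated list of host names.
key_types_for_host() {
    local host="$1" known_hosts_file="$2"
    awk -v host="$host" '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        {
            n = split($1, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == host) { print $2; break }
        }
    ' "$known_hosts_file" | sort -u
}
```

Calling key_types_for_host github.com "$HOME/.ssh/known_hosts" prints one key type per line, so a provisioning script can check that the types it expects are present.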

For enterprise environments, consider this hardened version:

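add_github_to_known_hosts_secure() {
    # Requires curl and jq; trust is anchored in the HTTPS connection to api.github.com
    local known_hosts_file="$HOME/.ssh/known_hosts"
    local github_domains=("github.com" "gist.github.com")
    local domain temp_file published key_type key_data verified

    # Published host keys, one "<type> <base64-key>" entry per line
    published=$(curl -fsS https://api.github.com/meta | jq -r '.ssh_keys[]')
    if [[ -z "$published" ]]; then
        echo "Could not fetch GitHub's published keys" >&2
        return 1
    fi

    temp_file=$(mktemp) || return 1
    for domain in "${github_domains[@]}"; do
        if ssh-keygen -F "$domain" -f "$known_hosts_file" >/dev/null 2>&1; then
            continue
        fi
        echo "Fetching $domain keys..."
        # Overwrite the temp file so keys from a previous domain are not re-added
        ssh-keyscan -t rsa,ecdsa,ed25519 "$domain" 2>/dev/null | grep -v '^#' > "$temp_file"

        # Every scanned key (type plus key data) must appear in the published list
        verified=1
        [[ -s "$temp_file" ]] || verified=0
        while read -r _ key_type key_data _; do
            grep -qF -- "$key_type $key_data" <<< "$published" || verified=0
        done < "$temp_file"

        if (( verified )); then
            cat "$temp_file" >> "$known_hosts_file"
            echo "Added verified $domain keys"
        else
            echo "Key verification failed for $domain" >&2
            rm -f "$temp_file"
            return 1
        fi
    done
    rm -f "$temp_file"
}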

To use this in your provisioning workflow:

#!/bin/bash
# Server provisioning script

# Ensure .ssh directory exists
mkdir -p ~/.ssh
chmod 700 ~/.ssh

# Add GitHub hosts
if ! add_github_to_known_hosts; then
    echo "Failed to configure GitHub SSH access" >&2
    exit 1
fi

# Rest of your provisioning tasks...

Consider these additional scenarios:

  • Different users: The script should work whether run as root or a regular user
  • Strict host checking: Set StrictHostKeyChecking to yes so SSH refuses hosts that aren't in known_hosts instead of prompting
  • CI/CD environments: May need to use alternative locations for known_hosts
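One way to apply the strict host checking item is a per-host stanza in the SSH client configuration. This is a sketch rather than a definitive setup: the helper name ensure_strict_github_config is hypothetical, and it assumes the config file contains no existing github.com stanza under a different pattern. Passing "$HOME/.ssh/config" makes it resolve to the right file whether the script runs as root or as a regular user:

```shell
#!/usr/bin/env bash
# Hypothetical helper: add a strict host-checking stanza for github.com to an
# ssh config file, unless a "Host github.com" line is already present.
ensure_strict_github_config() {
    local config_file="$1"

    mkdir -p "$(dirname "$config_file")"
    touch "$config_file"
    chmod 600 "$config_file"

    # Idempotency check: leave the file alone if the stanza already exists
    if grep -qiE '^[[:space:]]*Host[[:space:]]+github\.com[[:space:]]*$' "$config_file"; then
        return 0
    fi

    cat >> "$config_file" <<'EOF'

Host github.com
    StrictHostKeyChecking yes
EOF
}
```

With StrictHostKeyChecking set to yes, ssh never adds unknown keys on its own and refuses to connect when github.com is missing from known_hosts, so a provisioning mistake surfaces as an error rather than an interactive prompt.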

Example for CI environments:

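# SSH_KNOWN_HOSTS is a variable name chosen for this script; ssh itself does not read it
export SSH_KNOWN_HOSTS="/tmp/known_hosts"
add_github_to_known_hosts() {
    # Same function but uses $SSH_KNOWN_HOSTS instead of ~/.ssh/known_hosts
    local known_hosts_file="${SSH_KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"
    # ... [rest of the function as above] ...
}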
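Writing keys to an alternate file is only half of the CI setup: ssh and git also have to be told to read that file, since OpenSSH does not consult a custom environment variable by itself. The sketch below uses a hypothetical helper, ci_ssh_command, and assumes the path contains no spaces. It builds an ssh command line using the UserKnownHostsFile and StrictHostKeyChecking options, suitable for Git's GIT_SSH_COMMAND variable:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ssh command line that reads host keys from
# $SSH_KNOWN_HOSTS (this article's own variable), falling back to the default file.
# Assumes the path contains no spaces.
ci_ssh_command() {
    local known_hosts_file="${SSH_KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"
    printf 'ssh -o UserKnownHostsFile=%s -o StrictHostKeyChecking=yes' "$known_hosts_file"
}

# Usage (commented out so sourcing this file has no side effects):
#   export SSH_KNOWN_HOSTS="/tmp/known_hosts"
#   export GIT_SSH_COMMAND="$(ci_ssh_command)"
#   git clone git@github.com:OWNER/REPO.git   # OWNER/REPO are placeholders
```

Keeping the path in one variable means the function that writes the keys and the command that reads them cannot drift apart.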