How to Configure Nagios Host Checks Using SSH/HTTP Instead of ICMP Ping



In the sample configuration, Nagios determines host availability with ICMP ping (the check-host-alive command, which wraps check_ping). Where ICMP is blocked, this produces false "down" alerts even though the servers are reachable over other protocols.

Nagios supports multiple plugin-based checks for host availability:


# SSH check example
define command {
    command_name    check_ssh_alive
    command_line    $USER1$/check_ssh -H $HOSTADDRESS$ -t 30
}

# HTTP check example
define command {
    command_name    check_http_alive
    command_line    $USER1$/check_http -H $HOSTADDRESS$ -I $HOSTADDRESS$ -t 30
}

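Then point the host definition at the new command. Host definitions live in an object configuration file that nagios.cfg loads through cfg_file or cfg_dir, not necessarily in nagios.cfg itself: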


define host {
    host_name               webserver01
    alias                   Web Server
    address                 192.168.1.100
    check_command           check_ssh_alive  ; or check_http_alive
    max_check_attempts      3
    check_interval          5
    retry_interval          1
    check_period            24x7
    notification_interval   30
    notification_period     24x7
    notification_options    d,u,r
    contact_groups          admins
}

For comprehensive monitoring, also define a service check for each protocol. The generic-service template from the sample configuration fills in the required directives not listed here:


define service {
    use                     generic-service
    host_name               webserver01
    service_description     SSH_Availability
    check_command           check_ssh_alive
    check_interval          5
    retry_interval          1
}

define service {
    use                     generic-service
    host_name               webserver01
    service_description     HTTP_Availability
    check_command           check_http_alive
    check_interval          5
    retry_interval          1
}

When replacing ping checks:

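  • SSH checks take longer than ICMP because the plugin opens a TCP connection and waits for the server's SSH banner; roughly 300ms of extra latency is a rule of thumb that varies with the network
  • HTTP checks typically complete within 500ms on a responsive server
  • Set the plugin timeout (-t) well above these latencies so that a slow response is not reported as a failure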

Common issues and solutions:


# Verify plugin execution manually
/usr/lib/nagios/plugins/check_ssh -H 192.168.1.100

# Check Nagios debug logs (written only when debug_level is non-zero; the path is set by debug_file)
tail -f /var/log/nagios/nagios.debug

# Validate configuration
nagios -v /etc/nagios/nagios.cfg

When working in restricted network environments where ICMP/ping traffic is blocked, Nagios will incorrectly report servers as "down" despite them being operational. This occurs because Nagios defaults to using ping checks for basic host availability.

Nagios provides several robust alternatives to ICMP-based checks:

1. SSH-Based Host Alive Check

Create a custom check command in your Nagios configuration:

define command {
    command_name    check_ssh_alive
    command_line    $USER1$/check_ssh -H $HOSTADDRESS$ -p 22 -t 30
}

Then apply it to your host definition:

define host {
    use                     linux-server
    host_name               webserver1
    address                 192.168.1.100
    check_command           check_ssh_alive
    max_check_attempts      3
    ...
}

2. HTTP/HTTPS Service Check

For web servers, an HTTP check is often the better availability signal, because it tests the service that users actually depend on:

define command {
    command_name    check_http_alive
    command_line    $USER1$/check_http -H $HOSTADDRESS$ -I $HOSTADDRESS$ -t 30
}

define service {
    use                 generic-service
    host_name           webserver1
    service_description HTTP Availability
    check_command       check_http_alive
    check_interval      5
    retry_interval      1
}

For more complex scenarios, consider these approaches:

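NRPE Checks: When direct SSH/HTTP access isn't possible, use NRPE for remote execution. This requires the NRPE daemon to be running on the target host (TCP port 5666 by default) with a check_load command defined in its configuration: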

define command {
    command_name    check_nrpe_alive
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_load
}

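Combined Checks: Nagios runs exactly one command per check, and the "!" in check_command separates the command name from its arguments ($ARG1$, $ARG2$, ...); it does not chain two commands. To combine verification methods, point check_command at a wrapper command that runs both plugins. The name check_any_alive here is hypothetical and must be defined by you:

define service {
    use                 generic-service
    host_name           appserver1
    service_description Combined Alive Check
    check_command       check_any_alive
    check_interval      5
}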
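
Since Nagios executes a single command per check, one way to combine protocols is a small wrapper plugin. The following is a minimal sketch, not something Nagios ships: the script name check_any_alive.sh, the PLUGINDIR default, and the decision to report the more favourable of the two results are all assumptions.

```shell
#!/bin/sh
# check_any_alive.sh (hypothetical name): treat the host as alive when
# either HTTP or SSH answers. Plugin exit codes: 0 OK, 1 WARNING,
# 2 CRITICAL, 3 UNKNOWN.

# Assumed plugin location; adjust it to match $USER1$ in your setup.
PLUGINDIR="${PLUGINDIR:-/usr/lib/nagios/plugins}"

# Clamp anything outside 0-3 (e.g. 127 for a missing plugin) to UNKNOWN.
normalize_rc() {
    case "$1" in
        0|1|2|3) echo "$1" ;;
        *)       echo 3 ;;
    esac
}

# Print the more favourable (numerically lower) of two exit codes.
best_state() {
    a=$(normalize_rc "$1")
    b=$(normalize_rc "$2")
    if [ "$a" -le "$b" ]; then echo "$a"; else echo "$b"; fi
}

# Run both plugins against host $1 and return the better result.
check_any_alive() {
    http_out=$("$PLUGINDIR/check_http" -I "$1" -t 30 2>&1); http_rc=$?
    ssh_out=$("$PLUGINDIR/check_ssh" -H "$1" -t 30 2>&1); ssh_rc=$?
    # Drop everything from the first "|" so perfdata does not garble the line.
    echo "HTTP: ${http_out%%|*} / SSH: ${ssh_out%%|*}"
    return "$(best_state "$http_rc" "$ssh_rc")"
}

# Only run a check when a host address is supplied on the command line.
if [ "$#" -gt 0 ]; then
    check_any_alive "$1"
    exit $?
fi
```

Saved in the plugin directory and made executable, a script like this could be registered as a command whose command_line is $USER1$/check_any_alive.sh $HOSTADDRESS$. Taking the better of the two results suits a host-alive check; for a check where both services must be up, the worse result would be the right choice instead.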

When replacing ping checks with service-based verification:

  • Increase check timeouts (30-60 seconds instead of default 10)
  • Adjust max_check_attempts to account for temporary service fluctuations
  • Monitor your Nagios server's resource usage as these checks are more intensive
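
One detail worth checking when raising plugin timeouts: Nagios enforces its own limits through the host_check_timeout and service_check_timeout directives in nagios.cfg, and gives up on a check that runs longer. The sketch below reads such a directive and tests whether a plugin's -t value stays below it. The helper names get_setting and timeout_fits are illustrative, and the configuration path is an assumption.

```shell
#!/bin/sh
# Print the value of a "name=value" directive from a Nagios main config file.
# $1 = config file, $2 = directive name. If the directive appears more than
# once, this sketch simply reports the last occurrence.
get_setting() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Succeed only when the plugin timeout is strictly below the Nagios limit.
# $1 = plugin -t value, $2 = Nagios-side limit (both in seconds).
timeout_fits() {
    [ "$1" -lt "$2" ]
}

# Example run; the path is an assumption, so skip quietly if it is absent.
CFG="${NAGIOS_CFG:-/etc/nagios/nagios.cfg}"
if [ -r "$CFG" ]; then
    limit=$(get_setting "$CFG" host_check_timeout)
    if [ -n "$limit" ] && ! timeout_fits 30 "$limit"; then
        echo "A plugin timeout of 30s does not fit under host_check_timeout=$limit"
    fi
fi
```

If the warning appears, either lower the plugin's -t value or raise the corresponding limit in nagios.cfg, so that the plugin reports its own timeout before Nagios cuts it off.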

If checks still show incorrect status:

# Verify check execution manually:
/usr/lib/nagios/plugins/check_http -H 192.168.1.100

# Check Nagios debug logs, filtering on the host's name (replace HOSTNAME):
tail -f /var/log/nagios/nagios.debug | grep HOSTNAME
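
When running a plugin by hand, the exit code matters more than the text it prints, because the exit code is what Nagios acts on. As a rough guide, the sketch below maps plugin exit codes to the host state Nagios derives from them under default settings: OK and WARNING count as UP, anything else as DOWN (or UNREACHABLE, depending on parent hosts). Where available, an option such as use_aggressive_host_checking changes the WARNING case. The function names are illustrative.

```shell
#!/bin/sh
# Map a plugin exit code to its conventional state name.
# Codes outside 0-2 (3 and anything unexpected) are reported as UNKNOWN.
plugin_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Map a plugin exit code to the resulting host state, assuming defaults.
host_state() {
    case "$1" in
        0|1) echo UP ;;
        *)   echo DOWN ;;
    esac
}

# Demonstration with a simulated CRITICAL result (exit code 2).
rc=2
echo "exit code $rc: plugin state $(plugin_state "$rc"), host $(host_state "$rc")"
```

After a manual plugin run, echo $? shows the exit code to feed into these helpers. A plugin that prints a healthy-looking message but exits non-zero (or the reverse) is a common reason for a status that seems wrong.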

Remember to reload Nagios after configuration changes:

systemctl reload nagios
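
Reloading with a broken configuration can disrupt monitoring, so it can help to validate first. The sketch below chains the nagios -v validation shown earlier with the reload. The variable names and their defaults are assumptions to adapt to your installation.

```shell
#!/bin/sh
# Assumed defaults; override through the environment if your paths differ.
NAGIOS_BIN="${NAGIOS_BIN:-nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios/nagios.cfg}"
RELOAD_CMD="${RELOAD_CMD:-systemctl reload nagios}"

# Reload only if the configuration passes Nagios's own verification.
safe_reload() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        # Left unquoted on purpose so the command and its arguments split.
        $RELOAD_CMD
    else
        echo "Configuration check failed; not reloading" >&2
        return 1
    fi
}
```

Calling safe_reload after editing the configuration runs the reload only when verification succeeds. On failure, running nagios -v by hand shows the detailed error messages that this sketch suppresses.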