The classic headache scenario: your Perl service check scripts run perfectly from the command line, but Nagios keeps showing those frustrating "(Service check did not exit properly)" and "(null)" statuses in the web interface. Been there, done that.
First, let's eliminate the obvious suspects:
# Bad example causing null output
#!/usr/bin/perl
print "Everything looks good!"; # No status prefix, no trailing newline, exit code left implicit
# Proper Nagios-compatible format
#!/usr/bin/perl -w
use strict;
print "OK: Service operational|perfdata=1\n";
exit 0; # Explicit exit code
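Nagios takes the service state from the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), so it helps to see exactly what a script returns. The following is a minimal shell sketch of that mapping; `state_name` and `run_check` are hypothetical helper names, and the `sh -c` command only stands in for your real script.

```shell
# Map a plugin exit code to the state Nagios assigns to it.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID" ;;  # anything outside 0-3 is not a valid plugin result
    esac
}

# Run a plugin and summarize its exit code, state, and first output line.
# run_check is a hypothetical helper name, not part of Nagios.
run_check() {
    check_output=$("$@" 2>/dev/null)
    check_status=$?
    first_line=$(printf '%s\n' "$check_output" | head -n 1)
    echo "exit=$check_status state=$(state_name "$check_status") output=$first_line"
}

# Example: a stand-in for /path/to/script.pl that reports CRITICAL
run_check sh -c 'echo "CRITICAL: disk full"; exit 2'
```

Pointing `run_check` at the real script shows at a glance whether the exit code and the first output line agree with each other.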
Nagios runs checks with a stripped-down environment. Test this by running:
sudo -u nagios /path/to/your/script.pl
Common missing variables include PATH and PERL5LIB. Fix this by either:
# Option 1: Full paths in script
#!/usr/bin/perl
use lib '/full/path/to/modules';
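# Option 2: Define a macro in resource.cfg and pass it to perl -I in the command definition
# ($USER1$ conventionally holds the plugin directory, so pick an unused slot)
$USER5$=/full/path/to/custom/perl/modules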
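To approximate a stripped-down environment without switching users, `env -i` starts a command with an empty environment. This is only a sketch: `run_clean` is a hypothetical helper name, and the minimal PATH is an assumption, since the variables Nagios actually passes depend on your installation.

```shell
# Run a command with an empty environment plus a minimal PATH.
# Assumption: /usr/bin:/bin is close to what the check sees under Nagios.
run_clean() {
    env -i PATH=/usr/bin:/bin "$@"
}

# Example: show which variables survive the cleanup
run_clean sh -c 'echo "PATH=$PATH"; echo "PERL5LIB=${PERL5LIB:-unset}"'
```

If the check works in your shell but fails under `run_clean /path/to/your/script.pl`, a missing variable is the likely culprit.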
Even with 755 permissions, SELinux may still block Nagios from executing the script:
# Diagnostic commands:
ls -lZ /path/to/script.pl
getenforce
ps auxZ | grep nagios
Temporary test solution (not recommended for production):
chcon -t nagios_system_plugin_exec_t /path/to/script.pl
Nagios might be killing slow scripts. Add debug timing:
#!/usr/bin/perl
my $start = time;
# ... script logic ...
warn "Execution time: ". (time - $start) ." seconds\n";
Adjust service check timeout in nagios.cfg:
service_check_timeout=60
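Before raising the limit, it can be useful to see how the check behaves under a deadline from the shell. The sketch below assumes GNU coreutils `timeout`, which exits with status 124 when the deadline is hit; `run_with_deadline` is a hypothetical helper name, and the UNKNOWN message is illustrative rather than the text Nagios itself prints.

```shell
# Run a check under a deadline (in seconds) and report whether it finished in time.
# Assumption: GNU timeout, which exits with status 124 when the deadline is hit.
run_with_deadline() {
    deadline=$1
    shift
    timeout "$deadline" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "UNKNOWN: check exceeded ${deadline}s"
        return 3
    fi
    return "$status"
}
```

For example, `run_with_deadline 60 /path/to/script.pl` mimics a 60-second limit and passes the script's own exit code through when it finishes in time.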
When all else fails, implement proper logging:
#!/usr/bin/perl
open my $log, '>>', '/tmp/nagios_script.debug' or die $!;
print $log "Starting check at ".localtime."\n";
# ... main logic ...
print $log "Ending check at ".localtime."\n";
close $log;
For complex checks, consider wrapping the Perl script:
#!/bin/bash
output=$(/path/to/script.pl 2>&1)
status=$?
[ -z "$output" ] && output="(null)"
echo "$output"
exit $status
Or switch to NRPE for remote execution control:
command[check_service]=/usr/bin/perl /path/to/script.pl
When Perl-based service checks work perfectly via command line but fail in Nagios with cryptic messages like "(Service check did not exit properly)" or "(null)", we're typically dealing with environmental or execution context differences. Let's break down the potential causes and solutions.
The main suspects usually are:
- Environment variable differences between shell and Nagios
- Permission issues with the Nagios execution context
- Perl interpreter path discrepancies
- Output handling problems
- Timeout scenarios
First, let's modify your script to capture the actual execution environment. Create a test script like:
#!/usr/bin/perl
use strict;
use warnings;
# Print environment for debugging
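open(my $fh, '>', '/tmp/nagios_env_debug.log')
    or do { print "UNKNOWN: cannot write debug log: $!\n"; exit 3 };
print $fh "Perl path: $^X\n";
print $fh "UID: $<\n";
print $fh "GID: $(\n";
print $fh "ENV PATH: " . ($ENV{PATH} // '(unset)') . "\n"; # PATH may be unset under Nagios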
close $fh;
# Standard Nagios output
print "OK: Test succeeded\n";
exit 0;
Compare the output when run manually versus through Nagios. The differences will reveal environmental gaps.
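Because the test script overwrites its log on every run, one way to do the comparison is to copy the log aside after each run and diff the two copies. A small sketch, assuming the copies are saved under names of your choosing (`compare_env_logs` is a hypothetical helper name):

```shell
# Print only the lines that differ between two saved copies of the debug log.
# Lines starting with "<" come from the first file, ">" from the second.
compare_env_logs() {
    manual_log=$1
    nagios_log=$2
    if diff "$manual_log" "$nagios_log" >/dev/null; then
        echo "No differences found"
    else
        diff "$manual_log" "$nagios_log" | grep '^[<>]'
    fi
}
```

Called as `compare_env_logs manual.log nagios.log`, it narrows the output down to the variables that actually changed between the two contexts.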
1. Explicit Path Specification
Ensure your script's shebang line uses the full Perl path:
#!/usr/bin/env perl
# becomes (use whatever `which perl` returns, and keep comments off the shebang line)
#!/usr/local/bin/perl
2. Output Format Enforcement
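Nagios takes the state from the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and displays the first line of output, so keep that line in a predictable format. Add strict output validation: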
sub validate_output {
    my ($output) = @_;
    unless ($output =~ /^(OK|WARNING|CRITICAL|UNKNOWN):/) {
        die "Invalid Nagios output format";
    }
    return 1;
}
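The same prefix check can be run from the outside, against whatever the script actually prints, which is handy when testing from the command line. This is a sketch using the pattern from the Perl sub above; `has_status_prefix` is a hypothetical helper name.

```shell
# Return 0 if the first line of the given text starts with a Nagios status prefix.
has_status_prefix() {
    printf '%s\n' "$1" | head -n 1 | grep -Eq '^(OK|WARNING|CRITICAL|UNKNOWN):'
}

# Example: check a sample output line
if has_status_prefix "OK: Service operational"; then
    echo "format looks valid"
fi
```

In practice you would pass it the captured output of the check, for example `has_status_prefix "$(/path/to/script.pl)"`.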
3. Full Environment Initialization
Create a wrapper script that sets up the environment:
#!/bin/bash
export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
/path/to/your/perl/script.pl
For persistent issues, implement logging directly in your Perl script:
use Sys::Syslog qw(:standard :macros);
openlog('nagios_plugin', 'ndelay,pid', 'local0');
syslog(LOG_INFO, 'Script started with %s', "@ARGV"); # Format string keeps a stray % in the arguments harmless
# ... your code ...
syslog(LOG_INFO, "Script completed");
closelog;
Here's a complete working template that handles common pitfalls:
#!/usr/local/bin/perl
use strict;
use warnings;
use Carp qw(croak);
BEGIN {
    $ENV{PATH} = '/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin';
}
# PERL5LIB is only read at interpreter startup, so extend @INC with use lib instead
use lib '/usr/local/lib/perl5/site_perl';
# Nagios-compliant exit handler
END {
    # Inside END, $? already holds the pending exit code, so no shift is needed
    if ($? > 3 || $? < 0) {
        print "UNKNOWN: Abnormal exit ($?)\n";
        $? = 3; # Set $? rather than calling exit inside END
    }
}
# Your actual check logic here
eval {
    # Simulate check
    my $status = check_service();
    if ($status eq 'OK') {
        print "OK: Service operational\n";
        exit 0;
    }
    # ... other status cases ...
};
if ($@) {
    print "UNKNOWN: $@\n";
    exit 3;
}
sub check_service {
    # Implementation here
    return 'OK';
}