Troubleshooting NTP Daemon Sync Failures in Ubuntu: Why ntpd Only Works After Service Restart



What you're describing is a classic case of ntpd service degradation where:

  • Time sync works immediately after service restart
  • After approximately 24 hours, ntpq -p fails with "Connection refused"
  • The daemon appears to be running but stops responding to queries
  • Service restart temporarily resolves the issue

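First, verify current ntpd status (on pre-systemd releases such as Ubuntu 11.10, use service ntp status in place of the systemctl and timedatectl commands):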

systemctl status ntp
ntpq -pn
timedatectl status

Check for these critical indicators:

  • PID changes in system logs suggesting crashes
  • Network interface binding issues (visible in your syslog output)
  • Firewall rules blocking NTP traffic (UDP port 123)
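The first indicator can be checked by counting how many distinct ntpd PIDs and exit messages appear in the log. This is a minimal sketch that assumes the syslog line format quoted in this thread (`ntpd[PID]: ...`). The function names are illustrative, and /var/log/syslog is only the usual Ubuntu default.

```shell
#!/bin/bash
# Sketch: summarise ntpd restarts from a syslog-style file.
# Assumes lines contain "ntpd[PID]:" as in the excerpts quoted in this thread.

# Print each distinct ntpd PID found in the given log file, one per line.
ntpd_pids() {
    grep -o 'ntpd\[[0-9]*\]' "$1" | tr -dc '0-9\n' | sort -un
}

# Print the number of "exiting on signal" messages in the given log file.
ntpd_exit_count() {
    grep -c 'ntpd\[[0-9]*\]: ntpd exiting on signal' "$1"
}

# Example (adjust the path as needed):
#   ntpd_pids /var/log/syslog
#   ntpd_exit_count /var/log/syslog
```

Several PIDs within one day, each preceded by an exit message, would indicate that the daemon is being stopped and started repeatedly rather than hanging.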

The multiple interface bindings in your syslog suggest a likely culprit:

Listen normally on 3 eth0 xx.xxx.xxx.xxx UDP 123
Listen normally on 4 eth0:1 xx.xxx.xxx.xxx UDP 123
[...]

Potential issues:

  1. Interface flapping: Virtual interfaces (eth0:X) going down may destabilize ntpd
  2. Memory leaks: Older ntpd versions (like your 4.2.6p2) had known resource issues
  3. Kernel timekeeping conflicts: Check for adjtimex or hwclock interference

1. Restrict interface binding in /etc/ntp.conf:

interface ignore wildcard
interface listen eth0

2. Upgrade to modern ntpd (or switch to chrony):

sudo apt-get update && sudo apt-get install ntp -y
# OR for newer Ubuntu
sudo apt-get install chrony -y

Add monitoring to detect sync failures:

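#!/bin/bash
# Restart ntpd when ntpq shows no selected sync peer (a line starting with '*')
# or cannot reach the daemon at all.
if ! ntpq -pn 2>/dev/null | grep -q '^\*'; then
    service ntp restart
    echo "$(date) - NTP restart triggered" >> /var/log/ntp-watchdog.log
fi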

Set this as a cron job running hourly:

0 * * * * /usr/local/bin/ntp-watchdog.sh
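It can help for the watchdog log to record why a restart was triggered, since "daemon not answering" and "daemon running but no peer selected" are different failures. The sketch below classifies captured `ntpq -pn` output. The function name `classify_ntpq` and the three labels are made up for illustration.

```shell
#!/bin/bash
# Sketch: classify captured `ntpq -pn` output (including stderr).
# Prints "synced", "unreachable", or "unsynced".
classify_ntpq() {
    local out="$1"
    if printf '%s\n' "$out" | grep -q '^\*'; then
        # A peer line starting with '*' marks the currently selected sync source.
        echo synced
    elif printf '%s\n' "$out" | grep -q 'Connection refused'; then
        # ntpq could not talk to the local daemon at all.
        echo unreachable
    else
        echo unsynced
    fi
}

# Example:
#   classify_ntpq "$(ntpq -pn 2>&1)"
```

An "unreachable" result matches the symptom in this thread and points at the daemon having stopped. "unsynced" points at upstream servers or the network instead.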

Consider these architecture changes:

  • Deploy local NTP server with ntpd or chronyd
  • Use systemd-timesyncd as fallback
  • Implement PTP for microsecond accuracy requirements

After inheriting a seemingly stable Ubuntu 11.10 server that had been running without issues for months, I encountered an odd time synchronization problem. The system clock would consistently drift by exactly one hour after approximately 24 hours of uptime. Standard NTP troubleshooting revealed this pattern:

# Typical failure symptoms:
ntpq -p
ntpq: read: Connection refused

# Temporary fix via service restart:
sudo service ntp restart
* Stopping NTP server ntpd
start-stop-daemon: warning: failed to kill 26915: No such process
* Starting NTP server ntpd

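The log entries show the NTP daemon being terminated by SIGTERM (signal 15), followed by interface binding events when it is restarted. SIGTERM is sent by another process, so something on the system is stopping ntpd deliberately rather than the daemon crashing on its own: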

# Sample from /var/log/syslog:
Feb 13 11:18:38 ntpd[27108]: ntpd exiting on signal 15
Feb 13 11:18:40 serverx ntpd[29252]: Listen normally on 3 eth0 xx.xxx.xxx.xxx UDP 123

Three key observations emerge:

  1. The daemon dies approximately daily
  2. Multiple virtual interfaces (eth0:1 through eth0:8) appear in logs
  3. Process termination leaves stale PID files
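The third observation can be confirmed directly by checking whether the PID file still names a live process. This is a minimal sketch; the function name is illustrative, and /var/run/ntpd.pid is an assumption about where the init script keeps the PID file, so confirm the path on your system.

```shell
#!/bin/bash
# Sketch: return 0 if the PID file exists but its process is gone (stale),
# and 1 if the file is missing, empty, or names a running process.
pidfile_is_stale() {
    local pidfile="$1" pid
    [ -r "$pidfile" ] || return 1
    pid=$(tr -dc '0-9' < "$pidfile")
    [ -n "$pid" ] || return 1
    # kill -0 only tests for existence; the /proc check covers processes
    # owned by another user, where kill -0 fails with a permission error.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        return 1
    fi
    return 0
}

# Example (path is an assumption):
#   pidfile_is_stale /var/run/ntpd.pid && echo "stale PID file"
```

A stale file explains the "failed to kill 26915: No such process" warning seen during the restart: the init script tried to stop a process that had already exited.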

The server's configuration shows excessive virtual interfaces binding to NTP:

Listen normally on 3 eth0 xx.xxx.xxx.xxx UDP 123
Listen normally on 4 eth0:1 xx.xxx.xxx.xxx UDP 123
[...]
Listen normally on 11 eth0:8 xx.xxx.xxx.xxx UDP 123

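ntpd opens a separate socket for every address, so each alias adds another binding that can change underneath the daemon. Try limiting NTP binding to the primary interface: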

# In /etc/ntp.conf add:
interface ignore wildcard
interface listen eth0
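After restarting ntpd with the restricted configuration, the log can confirm whether fewer sockets were opened. The sketch below counts the "Listen normally" lines written by the most recently logged ntpd PID. It assumes the syslog format quoted above, and the function name is illustrative.

```shell
#!/bin/bash
# Sketch: print how many "Listen normally" lines the most recently
# logged ntpd instance produced in the given syslog-style file.
latest_listen_count() {
    local log="$1" pid
    # PID of the last ntpd instance that logged a binding.
    pid=$(grep -o 'ntpd\[[0-9]*\]: Listen normally' "$log" | tail -n 1 | tr -dc '0-9')
    if [ -z "$pid" ]; then
        echo 0
        return
    fi
    grep -c "ntpd\[$pid\]: Listen normally" "$log"
}

# Example (adjust the path as needed):
#   latest_listen_count /var/log/syslog
```

If the count is unchanged after the restart, the interface directives are probably not being read, for example because a different configuration file is in use.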

For systems where NTPD keeps dying, consider these approaches:

# Option 1: Monitor and restart via cron
*/30 * * * * /usr/bin/pgrep -x ntpd >/dev/null || /etc/init.d/ntp restart

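# Option 2: Use a systemd restart policy (for newer systems), e.g. in a drop-in for the ntp unit
[Service]
# Note: on-failure does not cover a clean SIGTERM exit; Restart=always would
Restart=on-failure
RestartSec=5s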

Ubuntu 11.10 is now EOL; where the package is available, switching to chrony may provide better stability:

sudo apt-get install chrony
sudo service chrony restart
chronyc sources -v

Key chrony advantages:

  • Better handling of intermittent networks
  • Faster sync after sleep/resume
  • More stable with virtual interfaces

Final checklist:

  1. Verify NTP server choices in /etc/ntp.conf
  2. Check for duplicate bindings
  3. Review system logs for interface changes
  4. Consider timezone settings (tzdata): an offset of exactly one hour often indicates a timezone or DST problem rather than NTP drift
  5. Monitor with: watch -n 60 ntpq -pn
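For the timezone item, comparing UTC with local time is a quick sanity check. ntpd disciplines the clock in UTC, so a whole-hour discrepancy that appears only in local time would point at tzdata rather than at synchronization. The helper below is a sketch with a made-up name that converts a `date +%z` style offset into minutes.

```shell
#!/bin/bash
# Sketch: convert a UTC offset in `date +%z` form (e.g. +0100) to signed minutes.
offset_to_minutes() {
    local z="$1" sign=1 hh mm
    case "$z" in -*) sign=-1 ;; esac
    hh=$(printf '%s' "$z" | cut -c2-3)
    mm=$(printf '%s' "$z" | cut -c4-5)
    # Strip one leading zero so values like 08 are not parsed as octal.
    hh=${hh#0}
    mm=${mm#0}
    echo $(( sign * (hh * 60 + mm) ))
}

# Example: show UTC and local time side by side, then the local offset.
#   date -u '+UTC:   %Y-%m-%d %H:%M:%S'
#   date    '+Local: %Y-%m-%d %H:%M:%S %Z'
#   offset_to_minutes "$(date +%z)"
```

If the UTC line is correct and only the local line is off by 60 minutes, the timezone data or DST rules are the more likely suspect.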