Debugging Postfix SMTP Connection Refused Errors to Multiple IP Addresses: A Complete Guide



The error logs show Postfix attempting to connect to multiple IP addresses (10.41.0.101, 10.41.0.247, etc.) for the same domain (ab.xyz.com) on port 25, with consistent "Connection refused" or timeout errors. Postfix tries several addresses because the domain resolves to multiple A records. The errors themselves mean that nothing is accepting SMTP connections on those addresses, or that a firewall is blocking port 25.

When you run dig ab.xyz.com or host ab.xyz.com, you'll likely see multiple A records returned:

$ dig ab.xyz.com
;; ANSWER SECTION:
ab.xyz.com.      3600    IN      A       10.41.0.101
ab.xyz.com.      3600    IN      A       10.41.0.102
ab.xyz.com.      3600    IN      A       10.41.0.110
ab.xyz.com.      3600    IN      A       10.41.0.135
ab.xyz.com.      3600    IN      A       10.41.0.247
ab.xyz.com.      3600    IN      A       10.40.40.130
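
Before changing any Postfix settings, it helps to know which of the listed addresses actually accept connections on port 25. The following is a minimal sketch, assuming a POSIX shell and an nc (netcat) build that supports -z and -w. The function name probe_smtp is only illustrative:

```shell
# Read IP addresses (one per line) on stdin and report whether TCP port 25
# accepts a connection. Assumes nc supports -z (connect only) and -w (timeout).
probe_smtp() {
  while read -r ip; do
    [ -n "$ip" ] || continue
    # </dev/null keeps nc from swallowing the remaining addresses on stdin
    if nc -z -w 5 "$ip" 25 </dev/null >/dev/null 2>&1; then
      echo "$ip open"
    else
      echo "$ip unreachable"
    fi
  done
}

# Usage: dig +short ab.xyz.com A | probe_smtp
```

If every address reports unreachable, the problem lies with the destination hosts or a firewall rather than with Postfix.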

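Postfix resolves the destination as described in RFC 5321, not RFC 2782 (which defines SRV records). When a name resolves to multiple A records, the Postfix SMTP client:

  1. Randomly shuffles addresses of equal preference (smtp_randomize_addresses, on by default)
  2. Attempts connections sequentially, up to the limits set by smtp_mx_address_limit and smtp_mx_session_limit
  3. Defers the message if every attempt fails, after which the queue manager retries at growing intervals bounded by minimal_backoff_time (default 300s) and maximal_backoff_time (default 4000s)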

For your specific case, consider these solutions:

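Option 1: Control Address Selection

These settings do not change DNS, but they control how Postfix walks the address list. smtp_skip_4xx_greeting exists only in Postfix 2.0 and earlier; later versions always skip servers that greet with a 4xx code, so it is omitted here. Add this to /etc/postfix/main.cf:

# Move on to the next address when a server greets with a 5xx code (default: yes)
smtp_skip_5xx_greeting = yes
# Try addresses in the order DNS returns them instead of shuffling them
smtp_randomize_addresses = no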

Option 2: Modify DNS Resolver Behavior

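The resolver cannot tell which mail hosts are up. These options only tune how nameservers are queried, which helps when slow or failing DNS lookups add to the delivery delay:

# In /etc/resolv.conf
# rotate: spread queries across the listed nameservers
# attempts, timeout: give up on an unresponsive nameserver sooner
options rotate
options attempts:2
options timeout:2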

Option 3: Postfix Transport Maps

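Bypass the DNS lookup and route the domain to one known-good host. The square brackets suppress MX lookups for the enclosed host:

# /etc/postfix/main.cf
transport_maps = hash:/etc/postfix/transport

# /etc/postfix/transport
ab.xyz.com smtp:[10.41.0.101]:25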

Then run:

postmap /etc/postfix/transport
postfix reload

After changes, verify with:

postconf -n | grep smtp_
postqueue -p
tail -f /var/log/mail.log

For long-term monitoring, consider setting up alerts for repeated deferrals using tools like:

# Sample Nagios check
define command {
    command_name check_postfix_deferred
    command_line /usr/lib/nagios/plugins/check_log -F /var/log/mail.log -O /tmp/mail.log.offset -q "status=deferred"
}

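Each failed attempt ties up an SMTP client process until it times out. Shorter timeouts and a shorter queue lifetime reduce that cost:

# /etc/postfix/main.cf
# Default: 30s
smtp_connect_timeout = 10s
# Default: 300s
smtp_helo_timeout = 10s
# Default: 5d
maximal_queue_lifetime = 1d
# Default: 300s
minimal_backoff_time = 30m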

In the Postfix mail log, the failures appear as repeated connection attempts to port 25 on several addresses (10.41.0.101, 10.41.0.247, etc.) for the domain ab.xyz.com:

connect to ab.xyz.com[10.41.0.101]:25: Connection refused
connect to ab.xyz.com[10.40.40.130]:25: Connection timed out

The multiple IP attempts indicate Postfix is receiving multiple A records for ab.xyz.com. This is confirmed by:

dig ab.xyz.com A
;; ANSWER SECTION:
ab.xyz.com.    300    IN    A    10.41.0.101
ab.xyz.com.    300    IN    A    10.41.0.102
ab.xyz.com.    300    IN    A    10.40.40.130

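Postfix selects delivery hosts as described in RFC 5321, section 5.1. RFC 2782 (SRV records) plays no part in a default configuration. When delivering mail, Postfix:

  1. Checks for MX records first
  2. Falls back to A records if no MX exists
  3. Tries the resulting addresses in preference order, shuffling addresses of equal preference, until one accepts the connection or the address limit is reached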
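
The lookup order above can be reproduced by hand with dig. This sketch assumes dig is installed and only mirrors the first two steps, without the shuffling or address limits that Postfix applies. The function name lookup_targets is illustrative:

```shell
# Print the hosts that would be considered for a domain: MX targets in
# preference order, or the domain itself when no MX record exists.
lookup_targets() {
  domain=$1
  # "dig +short ... MX" prints lines such as "10 mail.example.com."
  mx=$(dig +short "$domain" MX | sort -n | awk '{print $2}')
  if [ -n "$mx" ]; then
    printf '%s\n' "$mx"
  else
    printf '%s\n' "$domain"
  fi
}

# Usage: resolve every candidate host to its A records
#   for host in $(lookup_targets ab.xyz.com); do dig +short "$host" A; done
```

If lookup_targets prints the domain itself, no MX record exists and Postfix is working from the A records alone, which matches the behavior seen in the logs.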

To control this behavior, modify /etc/postfix/main.cf:

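# Preferred solution: Require a successful MX lookup
# (defer delivery instead of falling back to A records when the MX query gets no response)
ignore_mx_lookup_error = no

# Alternative: Limit IP attempts
# Maximum number of addresses collected per delivery (default: 5)
smtp_mx_address_limit = 2
# Maximum number of SMTP sessions per delivery request (default: 2)
smtp_mx_session_limit = 2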

# Last resort: Override DNS resolution
transport_maps = hash:/etc/postfix/transport

For transport maps:

# /etc/postfix/transport
ab.xyz.com    smtp:[specific.mail.server]:25

# Then run:
postmap /etc/postfix/transport
postfix reload

Use these commands to diagnose:

postconf -n | grep -E 'transport|smtp'
postmap -q ab.xyz.com hash:/etc/postfix/transport
postqueue -p

The current retry pattern shows exponential backoff:

  • First attempt: 30 seconds after queue activation
  • Subsequent attempts: 628s, 1527s, 3327s intervals

Adjust with (the values shown are the Postfix defaults, in seconds):

minimal_backoff_time = 300
maximal_backoff_time = 4000
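
The effect of these two settings can be approximated with a short calculation. This is a simplified model, not Postfix's actual code. It assumes the queue manager sets each retry delay to the message's age, clamped between minimal_backoff_time and maximal_backoff_time, and it ignores the queue scan interval (queue_run_delay). The function name retry_schedule is hypothetical:

```shell
# Print the approximate message age (in seconds) at each retry.
# Assumption: next delay = message age, clamped to [min, max].
retry_schedule() {
  min=$1
  max=$2
  attempts=$3
  age=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    delay=$age
    if [ "$delay" -lt "$min" ]; then delay=$min; fi
    if [ "$delay" -gt "$max" ]; then delay=$max; fi
    age=$((age + delay))
    echo "$age"
    i=$((i + 1))
  done
}

retry_schedule 300 4000 6
```

With the defaults, this prints 300, 600, 1200, 2400, 4800 and 8800: the delay roughly doubles until it reaches the 4000-second cap, after which retries are evenly spaced.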