Push vs Pull Configuration Management: Scalability Tradeoffs and Implementation Challenges in Infrastructure Automation


In configuration management, the push and pull models produce distinct operational patterns. Pull-based systems such as Puppet and Chef use continuous synchronization: an agent on each node periodically checks in with the control server (typically every 30 minutes). Because agents start at different times, or are deliberately staggered, check-ins spread across the interval instead of arriving together:

# Puppet agent run from cron: one run every 30 minutes, with splay to stagger hosts
*/30 * * * * /opt/puppetlabs/bin/puppet agent --onetime --no-daemonize --splay --splaylimit 60
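
To make the load-distribution claim concrete, here is a rough sketch of the arithmetic. The fleet size, the interval, and both helper functions are illustrative assumptions rather than anything taken from Puppet or Chef; the only modeling assumption is that each agent runs at a random offset within the interval.

```python
import random
from collections import Counter


def steady_state_rate(nodes: int, interval_s: int) -> float:
    """Average check-ins per second when agents are evenly spread."""
    return nodes / interval_s


def checkin_histogram(nodes: int, interval_s: int,
                      bucket_s: int = 60, seed: int = 0) -> Counter:
    """Count check-ins per time bucket when each agent has a random offset."""
    rng = random.Random(seed)
    buckets = Counter()
    for _ in range(nodes):
        offset = rng.randrange(interval_s)  # this agent's phase within the interval
        buckets[offset // bucket_s] += 1
    return buckets


if __name__ == "__main__":
    nodes, interval = 15_000, 1800  # hypothetical fleet, 30-minute interval
    print(f"average rate: {steady_state_rate(nodes, interval):.2f} check-ins/s")
    hist = checkin_histogram(nodes, interval)
    print(f"busiest minute: {max(hist.values())} check-ins, "
          f"versus {nodes} connections at once for an unthrottled push")
```

The server only has to be sized for the per-minute peak, not for the whole fleet arriving in one burst.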

Push systems like Ansible operate through orchestrated execution bursts. The control node opens many SSH connections at once, so concurrency has to be capped. Ansible does this with its forks setting, which defaults to 5:

# Ansible playbook execution with fork control (-f is the short form of --forks)
ansible-playbook site.yml --forks 50

Metric                | Push Model                  | Pull Model
----------------------|-----------------------------|---------------------------------
Connection Initiation | Controller → Nodes (bursty) | Nodes → Controller (distributed)
Failure Handling      | Requires retry logic        | Built-in through polling
New Node Bootstrap    | Manual intervention needed  | Automatic registration

Rackspace's reported use of Ansible suggests that push systems can scale when they implement:

  • Connection reuse (SSH multiplexing and pipelining)
  • Delta-based change propagation
  • Hierarchical execution topology

# SSH multiplexing and pipelining in ansible.cfg
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
control_path = ~/.ssh/ansible-%%r@%%h:%%p
pipelining = True
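
Delta-based propagation can be sketched as comparing content hashes and sending only what differs. The function names and file paths below are hypothetical, and the sketch assumes each node can report checksums for its current files; it is not a description of how Ansible itself decides what to transfer.

```python
import hashlib


def digest(content: bytes) -> str:
    """SHA-256 hex digest used as the change fingerprint."""
    return hashlib.sha256(content).hexdigest()


def files_to_push(desired: dict[str, bytes], reported: dict[str, str]) -> list[str]:
    """Return paths whose desired content differs from what the node reports."""
    return sorted(
        path for path, content in desired.items()
        if reported.get(path) != digest(content)  # missing or changed
    )


if __name__ == "__main__":
    desired = {
        "/etc/app.conf": b"workers = 8\n",
        "/etc/motd": b"managed host\n",
    }
    # What a node claims to have on disk (hypothetical report)
    reported = {
        "/etc/app.conf": digest(b"workers = 4\n"),
        "/etc/motd": digest(b"managed host\n"),
    }
    print(files_to_push(desired, reported))
```

Only the file whose checksum differs is selected, so an unchanged fleet costs one small comparison per node instead of a full transfer.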

The infrastructure.org critique points to concurrency problems in naive push implementations:

# Problematic push implementation pattern
for host in $(cat hostlist); do
  scp configs/* "$host":/etc/ &  # Unbounded backgrounding: one process and socket per host, all at once
done
wait

Compare this with an implementation that bounds concurrency with a thread pool:

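# Python example using ThreadPoolExecutor (assumes the third-party paramiko library)
from concurrent.futures import ThreadPoolExecutor

import paramiko

def push_config(host, local="configs/app.conf", remote="/etc/app.conf"):
    # Paths are illustrative; the SSH session is closed when the block exits
    with paramiko.SSHClient() as ssh:
        ssh.load_system_host_keys()
        ssh.connect(host, timeout=10)
        with ssh.open_sftp() as sftp:
            sftp.put(local, remote)
    return host

with open("hostlist") as f:
    hostlist = f.read().split()

# At most 50 concurrent connections, regardless of fleet size
with ThreadPoolExecutor(max_workers=50) as executor:
    for host in executor.map(push_config, hostlist):  # re-raises the first failure
        print(f"updated {host}")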

Modern solutions like SaltStack demonstrate hybrid approaches where minions can operate in both push and pull modes:

# SaltStack multi-mode configuration
# Push mode (run on the master):
salt '*' state.apply

# Pull mode (schedule in the minion config, e.g. /etc/salt/minion):
schedule:
  highstate:
    function: state.apply
    minutes: 30

The architectural choice ultimately depends on:

  1. Network topology constraints
  2. Change propagation urgency
  3. Node churn rate
  4. Security model requirements

In configuration management systems, the push vs pull debate centers around how configuration updates propagate through infrastructure. Pull-based systems like Puppet and Chef have clients periodically check a central server for updates, while push-based systems like Ansible initiate changes from a control node.

The primary advantage of pull-based systems emerges in large-scale deployments:


# Example Puppet agent configuration (pull-based), puppet.conf
[agent]
server = puppet-master.example.com
# Check in every 30 minutes (1800 seconds)
runinterval = 1800

This architecture naturally handles:

  • Client-initiated connections that don't overwhelm the server
  • Built-in retry mechanisms when clients are offline
  • Easier horizontal scaling of masters

While Rackspace demonstrates that push can scale to 15k nodes, a naive implementation quickly runs into trouble:


# Naive push implementation that doesn't scale
for host in $(cat hostlist); do
  ssh $host "sudo apt-get update && sudo apt-get upgrade -y"
done

The problems with this approach include:

  • Connection timeouts for offline nodes
  • TCP socket exhaustion
  • No built-in retry mechanism
  • Parallel execution complexity
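
The retry and parallelism problems in the list above can be handled together. The following is a minimal sketch, assuming the per-host work is wrapped in a callable that raises `ConnectionError` or `TimeoutError` when a node is unreachable; `run_with_retry`, `push_all`, and the host names are hypothetical, not part of any tool mentioned here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_retry(task, host, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task(host); on connection errors, retry with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return host, task(host), None
        except (ConnectionError, TimeoutError) as exc:
            if attempt == attempts:
                return host, None, exc  # give up, report the last error
            sleep(base_delay * 2 ** (attempt - 1))


def push_all(task, hosts, max_workers=50, **retry_kwargs):
    """Run task on every host with bounded concurrency; collect failures."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(
            lambda h: run_with_retry(task, h, **retry_kwargs), hosts))
    succeeded = {h: r for h, r, err in results if err is None}
    failed = {h: err for h, r, err in results if err is not None}
    return succeeded, failed


if __name__ == "__main__":
    calls = {}

    def flaky_task(host):
        # Stand-in for an SSH command: "web2" fails once, "db1" is always offline
        calls[host] = calls.get(host, 0) + 1
        if host == "db1" or (host == "web2" and calls[host] == 1):
            raise ConnectionError(f"{host} unreachable")
        return "ok"

    ok, bad = push_all(flaky_task, ["web1", "web2", "db1"], base_delay=0.01)
    print("succeeded:", sorted(ok))
    print("failed:", sorted(bad))
```

Returning failures instead of raising keeps one offline node from aborting the whole run, and the failed set can be fed into a later pass, which is roughly what polling gives a pull system for free.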

Here's how modern systems handle these challenges:

Pull-Based Optimization


# Chef recipe that keeps the pull agent itself current (chef_client_updater cookbook)
chef_client_updater 'Install latest Chef' do
  version 'latest'
  post_install_action 'kill'
end

Push-Based Scaling Solutions


# Ansible playbook with scaling optimizations
- hosts: all
  serial: 50
  max_fail_percentage: 5
  tasks:
    - name: Apply security updates
      apt:
        update_cache: yes
        upgrade: dist
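
The effect of `serial` and `max_fail_percentage` can be modeled in a few lines. This is a simplified sketch of the idea (process hosts in batches, stop once a batch's failure rate exceeds the limit), not Ansible's actual implementation; `rolling_apply` and the host names are hypothetical.

```python
from typing import Callable


def rolling_apply(hosts: list[str], apply: Callable[[str], bool],
                  batch_size: int = 50, max_fail_percentage: float = 5.0):
    """Apply to hosts batch by batch; abort when a batch fails too often.

    Returns (succeeded, failed, untouched).
    """
    succeeded, failed = [], []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        outcomes = [(h, apply(h)) for h in batch]  # apply() returns True on success
        batch_failed = [h for h, ok in outcomes if not ok]
        succeeded.extend(h for h, ok in outcomes if ok)
        failed.extend(batch_failed)
        if 100.0 * len(batch_failed) / len(batch) > max_fail_percentage:
            # Abort: hosts in later batches are never touched
            return succeeded, failed, hosts[start + batch_size:]
    return succeeded, failed, []


if __name__ == "__main__":
    fleet = [f"node{i:03d}" for i in range(200)]
    broken = {"node060", "node061", "node062"}  # hypothetical bad hosts
    ok, bad, untouched = rolling_apply(fleet, lambda h: h not in broken)
    print(len(ok), "succeeded,", len(bad), "failed,", len(untouched), "untouched")
```

Batching limits both the connection burst on the control node and the blast radius of a bad change: here the second batch trips the threshold and the remaining hosts are left alone.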

Consider these factors when selecting an architecture:

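Factor            | Pull-based          | Push-based
------------------|---------------------|------------------------------------------
Offline nodes     | ✔️ Automatic retry   | ❌ Manual handling
Immediate changes | ❌ Polling delay     | ✔️ Instant
Network topology  | ✔️ Works behind NAT  | ❌ Requires inbound connectivity to nodes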
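
The polling delay in the table can be estimated with simple arithmetic. This sketch assumes agents check in at evenly spread points in the interval and ignores the time a run itself takes; both helper functions are illustrative.

```python
def pull_delay_minutes(interval_min: float) -> tuple[float, float]:
    """Return (average, worst-case) wait before an agent picks up a change."""
    return interval_min / 2, interval_min


def fleet_convergence_fraction(elapsed_min: float, interval_min: float) -> float:
    """Fraction of agents that have checked in since the change was published."""
    return min(1.0, max(0.0, elapsed_min / interval_min))


if __name__ == "__main__":
    avg, worst = pull_delay_minutes(30)
    print(f"average wait: {avg} min, worst case: {worst} min")
    for t in (5, 15, 30):
        share = fleet_convergence_fraction(t, 30)
        print(f"after {t} min: {share:.0%} of agents have the change")
```

If a change has to land everywhere within a minute or two, a 30-minute pull interval cannot meet that on its own, which is where a push, or a pushed trigger for an early pull, earns its place.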

Some systems combine both models effectively:


# SaltStack hybrid example (minion config)
# Master-minion mode: the minion fetches states from the master
file_client: remote

# Masterless mode: states are read from local disk and applied with salt-call
file_client: local

The key is matching your architecture to:

  • Infrastructure size
  • Change frequency
  • Operational team size
  • Network constraints