Push vs Pull Configuration Management: Scalability Tradeoffs and Implementation Challenges in Infrastructure Automation

In configuration management systems, the push-pull dichotomy creates distinct operational patterns. Pull-based systems like Puppet and Chef implement a continuous synchronization model in which agents periodically check in with the control server (every 30 minutes by default). Because each node initiates its own connection on its own schedule, load on the server is spread across the interval:

# Puppet agent's periodic check via crontab (--onetime makes each run exit when it finishes)
*/30 * * * * /opt/puppetlabs/bin/puppet agent --onetime --no-daemonize
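
One way to make this load distribution explicit is to give each node a stable offset within the check-in interval, in the spirit of the splay settings that Puppet and Chef offer. The sketch below is only an illustration: `splay_offset` is a hypothetical helper, and deriving the offset from a hash of the hostname is an assumption, not a description of how either tool implements splay.

```python
import hashlib

def splay_offset(hostname: str, interval_s: int = 1800) -> int:
    """Return a stable per-host delay in [0, interval_s) seconds."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it into the interval.
    return int.from_bytes(digest[:8], "big") % interval_s

if __name__ == "__main__":
    for name in ("web01.example.com", "web02.example.com", "db01.example.com"):
        print(name, splay_offset(name))
```

Because the offset depends only on the hostname, a node keeps the same slot across restarts, and a large fleet ends up spread roughly evenly over the 30-minute window.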

Push systems like Ansible operate through orchestrated execution bursts. The control node opens SSH connections to many hosts in parallel, so the number of concurrent worker processes (forks) has to be capped deliberately:

# Ansible playbook execution with fork control (-f is the short form of --forks)
ansible-playbook site.yml --forks 50

Metric	Push Model	Pull Model
Connection Initiation	Controller → Nodes (bursty)	Nodes → Controller (distributed)
Failure Handling	Requires retry logic	Built-in through polling
New Node Bootstrap	Manual intervention needed	Automatic registration
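
The "requires retry logic" entry for the push model can be made concrete with a small wrapper. This is a minimal sketch under stated assumptions: `push_with_retry` and its parameters are hypothetical names, the `push` callable stands in for whatever actually delivers the configuration, and exponential backoff is one common choice rather than something the tools above prescribe.

```python
import time
from typing import Callable

def push_with_retry(push: Callable[[str], None], host: str,
                    attempts: int = 3, base_delay_s: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call push(host) up to `attempts` times; return True on success."""
    for attempt in range(attempts):
        try:
            push(host)
            return True
        except OSError:
            # Connection-level failures (refused, timed out) are retried.
            if attempt < attempts - 1:
                sleep(base_delay_s * 2 ** attempt)  # 1s, 2s, 4s, ...
    return False
```

A pull-based agent gets the same effect without extra code, because its next scheduled check-in is the retry.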

Rackspace's experience with Ansible shows that push systems can scale when they implement:

Connection reuse (SSH multiplexing and pipelining)
Delta-based change propagation
Hierarchical execution topology

# SSH multiplexing configuration in ansible.cfg
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
control_path = ~/.ssh/ansible-%%r@%%h:%%p
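
Delta-based change propagation, the second item in the list above, can be sketched as a digest comparison: only files whose content differs from what a node already holds are sent. The helper names `file_digests` and `changed_files` are hypothetical, and the idea that each node reports a path-to-digest map is an assumption made for illustration.

```python
import hashlib
from pathlib import Path

def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }

def changed_files(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Return paths that are new or differ from what the node reported."""
    return sorted(path for path, digest in local.items()
                  if remote.get(path) != digest)
```

The controller would call `file_digests` once per run and `changed_files` once per node, so an unchanged node costs one small comparison rather than a full copy.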

The infrastructure.org critique reveals fundamental concurrency challenges in push systems:

# Problematic push implementation pattern
for host in $(cat hostlist); do
  scp configs/* $host:/etc/ &  # Backgrounding causes socket exhaustion
done

Compare this with an implementation that bounds concurrency with a thread pool:

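# Python example using ThreadPoolExecutor (assumes the paramiko library is installed)
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

from paramiko import SSHClient

def push_config(host):
    # Copy every file in configs/ to /etc/ on one host over SFTP
    with SSHClient() as ssh:
        ssh.load_system_host_keys()
        ssh.connect(host)
        with ssh.open_sftp() as sftp:
            for path in Path('configs').iterdir():
                sftp.put(str(path), f'/etc/{path.name}')

hostlist = Path('hostlist').read_text().split()

with ThreadPoolExecutor(max_workers=50) as executor:
    # Consume the iterator so exceptions raised in workers surface here
    list(executor.map(push_config, hostlist))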

Modern solutions like SaltStack demonstrate hybrid approaches where minions can operate in both push and pull modes:

# SaltStack multi-mode configuration
# Push mode (command run on the master):
salt '*' state.apply

# Pull mode (scheduled in the minion configuration):
schedule:
  highstate:
    function: state.apply
    minutes: 30

The architectural choice ultimately depends on:

Network topology constraints
Change propagation urgency
Node churn rate
Security model requirements

At its core, the push vs pull debate centers on how configuration updates propagate through infrastructure. In pull-based systems like Puppet and Chef, clients periodically check a central server for updates, while push-based systems like Ansible initiate changes from a control node.

The primary advantage of pull-based systems emerges in large-scale deployments:


# Example Puppet agent configuration (pull-based)
[agent]
server = puppet-master.example.com
# Check every 30 minutes (puppet.conf does not allow trailing comments on a setting line)
runinterval = 1800

This architecture naturally handles:

Client-initiated connections that don't overwhelm the server
Built-in retry mechanisms when clients are offline
Easier horizontal scaling of masters

While Rackspace has demonstrated that push can scale to 15k nodes, the implementation becomes complex:


# Naive push implementation that doesn't scale
for host in $(cat hostlist); do
  ssh $host "sudo apt-get update && sudo apt-get upgrade -y"
done

The problems with this approach include:

Connection timeouts for offline nodes
TCP socket exhaustion once the loop is parallelized without a limit
No built-in retry mechanism
Parallel execution complexity
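
The first problem in this list, connection timeouts for offline nodes, is often softened by probing each host with a short timeout before starting the real work. The following is a rough sketch using only the standard library; `is_reachable` and `partition_hosts` are hypothetical helpers, and the choice of port 22 and a 3-second timeout are assumptions.

```python
import socket

def is_reachable(host: str, port: int = 22, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False

def partition_hosts(hosts: list[str], port: int = 22,
                    timeout_s: float = 3.0) -> tuple[list[str], list[str]]:
    """Split hosts into (reachable, unreachable) so offline nodes can be retried later."""
    up, down = [], []
    for host in hosts:
        (up if is_reachable(host, port, timeout_s) else down).append(host)
    return up, down
```

The probe here is sequential for clarity; in practice it would run inside the same bounded worker pool as the push itself, and the unreachable list would feed whatever retry mechanism is in place.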

Here's how modern systems handle these challenges:

Pull-Based Optimization


# Chef recipe that keeps the pulling client itself current (chef_client_updater cookbook)
chef_client_updater 'Install latest Chef' do
  version 'latest'
  post_install_action 'kill'
end

Push-Based Scaling Solutions


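# Ansible playbook with scaling optimizations
- hosts: all
  become: true             # apt needs root privileges
  serial: 50               # work through the inventory 50 hosts at a time
  max_fail_percentage: 5   # abort if more than 5% of a batch fails
  tasks:
    - name: Apply security updates
      apt:
        update_cache: yes
        upgrade: dist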
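
The effect of `serial` and `max_fail_percentage` in the playbook above can be modelled in a few lines. This is a simplified sketch of the batching idea, not Ansible's actual implementation: `batches` and `rolling_run` are hypothetical names, and `apply` stands in for running the play's tasks on one host and reporting success.

```python
from typing import Callable, Iterator

def batches(hosts: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` hosts."""
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]

def rolling_run(hosts: list[str], apply: Callable[[str], bool],
                serial: int = 50, max_fail_percentage: float = 5.0) -> list[str]:
    """Apply to hosts batch by batch; stop once a batch exceeds the failure limit.

    Returns the hosts that were attempted.
    """
    attempted: list[str] = []
    for batch in batches(hosts, serial):
        failures = sum(0 if apply(host) else 1 for host in batch)
        attempted.extend(batch)
        # The limit must be exceeded, not merely reached, to abort.
        if failures * 100 / len(batch) > max_fail_percentage:
            break
    return attempted
```

With a batch of 50 and a 5% limit, two failures (4%) let the rollout continue, while three failures (6%) stop it before the next batch is touched, which limits how far a bad change can spread.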

Consider these factors when selecting an architecture:

Factor	Pull Advantage	Push Advantage
Offline nodes	✔️ Automatic retry	❌ Manual handling
Immediate changes	❌ Polling delay	✔️ Instant
Network topology	✔️ Works behind NAT	❌ Requires inbound connectivity to every node
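
The "Immediate changes" row is worth putting numbers on, because a push is only instant for the first batch of nodes. The arithmetic below is a back-of-the-envelope sketch: both function names are hypothetical, the push model assumes every node takes the same time, and the pull model assumes check-ins are spread evenly across the interval.

```python
import math

def push_duration_s(nodes: int, forks: int, per_node_s: float) -> float:
    """Estimate wall-clock time for a push that runs `forks` nodes at a time."""
    return math.ceil(nodes / forks) * per_node_s

def pull_delay_s(interval_s: float) -> tuple[float, float]:
    """Return (average, worst-case) wait before a node picks up a change,
    assuming check-ins are spread evenly across the interval."""
    return interval_s / 2, interval_s
```

Under an assumed 20 seconds per node, `push_duration_s(15000, 50, 20.0)` comes to 6000 seconds, or 100 minutes, whereas `pull_delay_s(1800)` gives an average wait of 900 seconds and a worst case of 1800 seconds before a node starts applying the change. At that scale the two models are closer than the table suggests, and the answer depends heavily on the fork count and per-node run time.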

Some systems combine both models effectively:


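# SaltStack hybrid example (minion configuration)
# Master-minion setup: the minion fetches state files from the master
file_client: remote

# Masterless setup: the minion applies local state files (run with salt-call --local)
file_client: local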

The key is matching your architecture to:

Infrastructure size
Change frequency
Operational team size
Network constraints
