When managing infrastructure at scale, you'll often encounter situations where you need to retrieve log files, configuration backups, or diagnostic data from multiple servers. Ansible's built-in `fetch` module has a significant limitation: it copies only a single file per task invocation and cannot recurse into directories.
```yaml
# This only fetches one file
- name: Fetch single file
  ansible.builtin.fetch:
    src: /var/log/app/error.log
    dest: /backup/logs/
    flat: yes
```
## 1. Using find + fetch with register
This approach dynamically discovers files and processes them sequentially:
```yaml
- name: Find all target files
  ansible.builtin.find:
    paths: /var/log/app/
    patterns: "*.log"
    recurse: no
  register: found_files

- name: Fetch all matched files
  ansible.builtin.fetch:
    src: "{{ item.path }}"
    dest: "/backup/logs/{{ inventory_hostname }}/"
    flat: yes
  loop: "{{ found_files.files }}"
```
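One caveat: if you later enable `recurse: yes`, files from different subdirectories that share a basename will overwrite each other under `flat: yes`. A hedged workaround is to rebuild the relative path with Ansible's `relpath` filter (the `/var/log/app` prefix here is assumed to match the find task above):

```yaml
- name: Fetch while preserving relative paths (sketch)
  ansible.builtin.fetch:
    src: "{{ item.path }}"
    dest: "/backup/logs/{{ inventory_hostname }}/{{ item.path | relpath('/var/log/app') }}"
    flat: yes
  loop: "{{ found_files.files }}"
```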
## 2. Parallel Transfer with async and async_status
For large file collections across many hosts, parallel execution improves performance:
```yaml
# NOTE: fetch is an action plugin that does part of its work on the
# controller; verify that async behaves as expected in your Ansible
# version before relying on this pattern.
- name: Async fetch operation
  ansible.builtin.fetch:
    src: "{{ item }}"
    dest: "/backup/{{ inventory_hostname }}/"
  loop: "{{ files_to_fetch }}"
  async: 45
  poll: 0
  register: async_results

- name: Check async tasks
  ansible.builtin.async_status:
    jid: "{{ item.ansible_job_id }}"
  loop: "{{ async_results.results }}"
  register: async_poll_results
  until: async_poll_results.finished
  retries: 30
```
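Completed async jobs leave result files in `~/.ansible_async` on the targets. `async_status` supports `mode: cleanup` to remove them; a housekeeping sketch using the register variable from above:

```yaml
- name: Clean up async job result files
  ansible.builtin.async_status:
    jid: "{{ item.ansible_job_id }}"
    mode: cleanup
  loop: "{{ async_results.results }}"
```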
## 3. Custom Module for Complex Scenarios
When you need advanced features like compression or delta transfers:
```python
# my_fetch_all.py (custom module)
from ansible.module_utils.basic import AnsibleModule
import os      # used by the elided implementation
import shutil  # used by the elided implementation


def main():
    module = AnsibleModule(
        argument_spec=dict(
            src_dir=dict(type='str', required=True),
            dest_base=dict(type='str', required=True),
            pattern=dict(type='str', default='*'),
        )
    )
    # Implementation logic here
    # ...
    module.exit_json(changed=False)


if __name__ == '__main__':
    main()
```
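The file-selection half of such a module can be sketched in plain Python, independent of Ansible. `select_files` is a hypothetical helper, not part of any Ansible API:

```python
import fnmatch
import os


def select_files(src_dir, pattern="*"):
    """Walk src_dir and return sorted paths whose basename matches pattern."""
    matches = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```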
- Always include `inventory_hostname` in destination paths to avoid filename collisions between hosts
- Set appropriate permissions on the collected copies afterwards (`fetch` itself has no `mode` parameter, so use a follow-up `ansible.builtin.file` task)
- For large transfers, use `throttle` to limit how many hosts run the task at once (it caps concurrency, not bandwidth)
- Implement proper error handling with `ignore_errors` and `failed_when`
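The last point can be made concrete. One idiomatic pattern is `failed_when: false`, which records each file's result without ever aborting the host; a sketch reusing the `found_files` register from earlier:

```yaml
- name: Fetch optional diagnostics without failing the play
  ansible.builtin.fetch:
    src: "{{ item.path }}"
    dest: "/backup/logs/{{ inventory_hostname }}/"
    flat: yes
  loop: "{{ found_files.files }}"
  register: fetch_results
  failed_when: false  # per-item outcomes stay in fetch_results.results
```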
When dealing with hundreds of servers:
```yaml
- name: Optimized batch fetch
  ansible.builtin.fetch:
    src: "/data/{{ item | basename }}"
    dest: "/archive/{{ inventory_hostname }}/"
  # CAUTION: fileglob is evaluated on the control node, not the managed
  # host; if the files live remotely, build this list with the find
  # module instead.
  loop: "{{ query('fileglob', '/data/*.tar.gz') }}"
  throttle: 10
  become: false  # Reduces privilege escalation overhead
  vars:
    ansible_ssh_pipelining: true
    ansible_scp_if_ssh: true
```
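For whole directory trees it may be worth stepping outside `fetch` entirely: the `ansible.posix.synchronize` module wraps rsync and supports pull mode with delta transfers. A sketch, assuming the `ansible.posix` collection is installed and rsync is available on both ends:

```yaml
- name: Pull an entire directory via rsync
  ansible.posix.synchronize:
    mode: pull
    src: /var/log/app/
    dest: "/backup/logs/{{ inventory_hostname }}/"
```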
To recap the core pattern as a complete playbook: across a fleet you often need to collect log files, configuration backups, or diagnostic data from identical directory structures, and while `fetch` works perfectly for single files, it needs `find` to handle multiple files or entire directories.
```yaml
- name: Gather multiple files using find + fetch combo
  hosts: webservers
  tasks:
    - name: Find all .log files in /var/log/app/
      ansible.builtin.find:
        paths: /var/log/app/
        patterns: "*.log"
        recurse: yes
      register: found_files

    - name: Fetch each found file
      ansible.builtin.fetch:
        src: "{{ item.path }}"
        dest: "/tmp/ansible_fetched/{{ inventory_hostname }}/"
        flat: yes
      loop: "{{ found_files.files }}"
```
The solution combines two powerful Ansible modules:

- `find`: Recursively locates files matching specific patterns
- `fetch`: Transfers files while maintaining host-based directory structure
For more complex file selection criteria:
```yaml
- name: Find files modified in last 24 hours
  ansible.builtin.find:
    paths: /opt/backups/
    age: "-1d"         # modified less than one day ago
    size: "+1M"        # larger than one megabyte
    file_type: "file"
  register: recent_backups
```
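When glob patterns are not expressive enough, `find` also accepts regular expressions via `use_regex`. A sketch (the `db-` filename convention is an assumption for illustration):

```yaml
- name: Find dated database dumps by regex
  ansible.builtin.find:
    paths: /opt/backups/
    patterns: '^db-\d{8}\.sql\.gz$'
    use_regex: yes
  register: db_dumps
```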
When dealing with hundreds of files across dozens of servers:
- Use the `throttle` parameter to limit concurrent transfers
- Consider `async` mode for long-running operations
- Implement `serial` execution for resource-constrained environments
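The `serial` keyword from the last point is set at play level. A minimal sketch that processes five hosts per batch (the single-file task is purely illustrative):

```yaml
- name: Collect logs in batches
  hosts: webservers
  serial: 5  # five hosts at a time
  tasks:
    - name: Fetch the application error log
      ansible.builtin.fetch:
        src: /var/log/app/error.log
        dest: "/backup/logs/{{ inventory_hostname }}/"
        flat: yes
```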
Make your playbook resilient with proper error handling:
```yaml
- name: Safely fetch files with error tolerance
  ansible.builtin.fetch:
    src: "{{ item.path }}"
    dest: "/collected/{{ inventory_hostname }}/"
  loop: "{{ found_files.files }}"
  ignore_errors: yes
  when: found_files.matched > 0
```
For collecting Apache access logs from a web server cluster:
```yaml
- name: Collect rotated access logs
  ansible.builtin.find:
    paths: /var/log/httpd/
    patterns: "access.log*"
  register: apache_logs

- name: Fetch logs with date-based organization
  ansible.builtin.fetch:
    src: "{{ item.path }}"
    # With flat: no, fetch appends {{ inventory_hostname }}/<remote path>
    # under dest automatically, so the hostname is not repeated here.
    # ansible_date_time requires fact gathering to be enabled.
    dest: "/analytics/logs/{{ ansible_date_time.date }}/"
    flat: no
  loop: "{{ apache_logs.files }}"
```