Understanding rsync’s Multi-Process Behavior: Why It Spawns Child Processes and How to Control It


When examining process trees during an rsync operation, many administrators notice unexpected child processes. Here's what's actually happening in your case:

├─cron───cron───rsync───rsync───rsync

This hierarchy shows:

  1. The main cron daemon process
  2. The cron job instance for your specific schedule
  3. The first rsync process, started by the cron job
  4. Its child rsync process and that child's own child (a chain, not two siblings)

Rsync creates multiple processes by design to handle different aspects of the synchronization:

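  • Sender (PID 9972 in your case): The process cron started. It builds the file list and reads the source files
  • Generator (PID 9973): Forked from the sender. It compares the file list with the destination and decides what needs updating
  • Receiver (PID 9974): Forked from the generator. It writes incoming data to temporary files and moves them into place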
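
If you want to attach these roles to live PIDs, a small helper can infer them from the parent/child chain. This is only a sketch: the function name label_rsync_roles is made up for illustration, and it assumes a local copy, where the first rsync process acts as the sender, its child as the generator, and the grandchild as the receiver.

```shell
# Sketch: label the processes of one local rsync run by their depth in the
# parent/child chain. Input on stdin: "PID PPID" pairs, one per line.
# Assumption: local copy, so the chain is sender -> generator -> receiver.
label_rsync_roles() {
    awk '
        { ppid[$1] = $2; order[NR] = $1 }
        END {
            for (i = 1; i <= NR; i++) {
                pid = order[i]
                depth = 0
                # Walk upwards for as long as the parent is also an rsync PID.
                for (p = ppid[pid]; p in ppid; p = ppid[p])
                    depth++
                if (depth == 0)
                    role = "sender"
                else if (depth == 1)
                    role = "generator"
                else
                    role = "receiver"
                print pid, role
            }
        }
    '
}

# Example with the PIDs from the listing; 100 stands in for the cron job's PID.
printf '9972 100\n9973 9972\n9974 9973\n' | label_rsync_roles
```

On a system with procps, the same function can be fed from a live listing with ps -o pid=,ppid= -C rsync | label_rsync_roles. If several rsync jobs run at once, each chain is labeled independently.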

This architecture provides several benefits:

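1. Pipelining: the generator can examine the next file while the receiver is still writing the previous one
2. Better handling of large directory structures, since scanning and transferring overlap instead of running one after the other
3. One code path for local and remote transfers, because the processes talk over pipes or sockets either way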

While the multi-process model is generally optimal, there are cases where you might prefer single-process operation:

  • On resource-constrained systems
  • When you need simpler process tracking
  • In certain debugging scenarios

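These options are often suggested for controlling rsync's process behavior, but none of them removes the extra processes:

--no-detach        # Only meaningful with --daemon: keeps the daemon in the foreground
--server           # Internal option that rsync passes to the remote side; not meant to be typed by hand
--daemon           # Runs rsync as a network daemon (forks a process per connection)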

rsync has no option that gives you single-process operation. Adding --no-detach to an ordinary copy changes nothing, because the flag only applies to daemon mode:

rsync --no-detach -ac --delete /source /dest

This still creates the same three processes.

If you absolutely need single-process behavior, consider these alternatives:

# Using cp with appropriate options
cp -a --remove-destination /source/. /dest/

# Using tar for efficient transfers (note: the pipeline itself runs two tar processes)
(cd /source && tar cf - .) | (cd /dest && tar xf -)

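Remember that these alternatives copy everything on every run and do not remove files that have disappeared from the source, as rsync --delete does. rsync skips files that are already up to date.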

To better understand what's happening, use these monitoring commands:

# Show process tree
pstree -p | grep rsync

# Detailed process listing
ps auxf | grep rsync

# Real-time monitoring
watch -n 1 'pstree -p | grep rsync'

In a process listing, the same tree shows up as three distinct rsync processes:

9972 ?        Ds     1:00 rsync -ac --delete /source/folder /dest/folder
9973 ?        S      0:29 rsync -ac --delete /source/folder /dest/folder
9974 ?        S      0:09 rsync -ac --delete /source/folder /dest/folder

rsync employs a client-server architecture even for local operations. The three processes represent:

  1. The sender, which builds the file list and reads the source files
  2. The generator, which compares that list against the destination
  3. The receiver, which writes the updated files at the destination

This design enables efficient pipelining and error isolation during transfers. The separation allows rsync to:

  • Parallelize checksum calculations and file transfers
  • Maintain clean process boundaries for resource management
  • Use the same protocol for local and remote transfers

While generally beneficial, this behavior can cause issues when:

  • In resource-constrained environments
  • Where process count limits apply
  • In containers where PID space is limited
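
To judge whether a process limit is really a concern, you can check the headroom before the job starts. The sketch below is illustrative rather than definitive: has_headroom and RSYNC_PROCS are names invented here, the count of three assumes a local copy like the one above, and the cgroup v2 path /sys/fs/cgroup is an assumption about how your container is set up.

```shell
# Processes one local rsync run occupies (three, as in the listing above).
RSYNC_PROCS=3

# has_headroom CURRENT MAX
# Succeeds if MAX is unlimited or CURRENT + RSYNC_PROCS still fits under MAX.
has_headroom() {
    current=$1
    max=$2
    case $max in
        max|unlimited) return 0 ;;   # no limit configured
    esac
    [ $((current + RSYNC_PROCS)) -le "$max" ]
}

# Assumed cgroup v2 location; these files exist only inside a limited cgroup.
cg=/sys/fs/cgroup
if [ -r "$cg/pids.max" ] && [ -r "$cg/pids.current" ]; then
    if has_headroom "$(cat "$cg/pids.current")" "$(cat "$cg/pids.max")"; then
        echo "pid limit: enough headroom for rsync"
    else
        echo "pid limit: fewer than $RSYNC_PROCS free slots" >&2
    fi
fi
```

The same function works for the per-user limit if you pass it a current process count together with the output of ulimit -u.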

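The --no-detach flag is sometimes suggested for forcing single-process operation, but it only affects daemon mode and does not change how many processes a copy uses. The cron entry works the same without it:

19 21 * * * root rsync -ac --delete /source/folder /dest/folder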

Where --no-detach does apply is daemon mode. For simple copies, another tool may be enough:

# Keep an rsync daemon in the foreground (it still forks one process per connection)
rsync --daemon --no-detach --config=/path/to/rsyncd.conf

# For simple copies, consider alternative tools:
cp -a --reflink=auto /source/folder /dest/folder

Before switching tools, benchmark your operations:

# rsync (three processes)
time rsync -ac --delete /source/folder /dest/folder

# Single-process alternative
time cp -a /source/folder /dest/folder

rsync's multi-process model tends to pay off most for:

  • Large directory trees with many small files
  • Network transfers (even localhost)
  • Systems with multiple CPU cores

For those needing strict process control, wrap rsync in flock so that overlapping cron runs cannot pile up:

#!/bin/bash
flock -n /tmp/rsync.lock -c "rsync -ac --delete /source/folder /dest/folder"

Then call the wrapper from cron instead of direct rsync execution.
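
Where flock is not available, a similar guard can be sketched with mkdir, which is atomic. This swaps in a different technique from the flock wrapper above: run_exclusive is a hypothetical helper name, the exit code 75 is an arbitrary choice for "lock busy", and the lock path in the usage comment is only an example.

```shell
# run_exclusive LOCKDIR CMD [ARGS...]
# Runs CMD only if LOCKDIR can be created. mkdir either creates the
# directory or fails, so two concurrent callers cannot both proceed.
# Returns 75 if the lock is already held, otherwise CMD's exit status.
run_exclusive() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "lock $lockdir is held; skipping this run" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}

# Hypothetical usage from the cron wrapper:
# run_exclusive /tmp/rsync.lock.d rsync -ac --delete /source/folder /dest/folder
```

Unlike flock, whose lock is released by the kernel when the process exits, a lock directory survives a killed job and has to be removed by hand. Prefer flock where it is installed.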

When troubleshooting, examine process relationships with:

ps -ef --forest | grep rsync
pstree -p | grep rsync
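# task/ lists the threads of each process; loop because pgrep returns several PIDs
for pid in $(pgrep rsync); do ls -l "/proc/$pid/task/"; done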