In Linux/Unix systems, when a file is deleted while still being held open by a process, the data remains accessible through /proc/<pid>/fd/<N> until the process closes it. However, this isn't ideal for several reasons:
- The file disappears from the normal filesystem namespace
- You can't access it through normal paths
- Simple cat operations may not capture continuously updated data
When a file is deleted in Unix-like systems:
1. The directory entry is removed
2. The link count in the inode is decremented
3. If link count reaches 0, space is marked as free
But if a process has the file open, the inode and data blocks aren't actually freed until all file descriptors are closed.
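The behaviour described above can be observed directly. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name demo_unlinked_file is made up for illustration.

```python
import os
import tempfile


def demo_unlinked_file():
    """Delete a file while it is open and show the data is still reachable."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here\n")
        links_before = os.fstat(fd).st_nlink  # one directory entry points at the inode
        os.unlink(path)                       # remove that directory entry
        links_after = os.fstat(fd).st_nlink   # no names left; the open fd keeps the inode alive
        path_exists = os.path.exists(path)    # the old path is gone from the namespace
        # Linux-specific: the open descriptor is still exposed under /proc
        with open(f"/proc/self/fd/{fd}", "rb") as f:
            data = f.read()
        return links_before, links_after, path_exists, data
    finally:
        os.close(fd)                          # only now can the kernel free the blocks


if __name__ == "__main__":
    print(demo_unlinked_file())
```

The link count drops to 0 as soon as the name is removed, yet the contents stay readable until the last descriptor is closed.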
Here are several approaches to recover the file:
Method 1: Copy Through /proc (Basic Recovery)
# Find the process holding the file
$ lsof | grep deleted
# Copy the contents
$ cat /proc/1234/fd/15 > recovered_file.txt
Limitation: This creates a static copy and won't reflect ongoing writes.
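One way around the static-copy limitation is to keep reading after reaching end of file, much like tail -f does. The sketch below is an illustration only: follow_copy, idle_timeout and poll_interval are hypothetical names, and it assumes the writer only appends to the file.

```python
import time


def follow_copy(src, dest, idle_timeout=5.0, poll_interval=0.2):
    """Copy src to dest, then keep appending new bytes until src has been
    idle for idle_timeout seconds. Returns the number of bytes copied."""
    total = 0
    last_data = time.monotonic()
    # buffering=0 so every read() goes to the kernel and sees fresh data
    with open(src, "rb", buffering=0) as fin, open(dest, "wb") as fout:
        while True:
            chunk = fin.read(65536)
            if chunk:
                fout.write(chunk)
                fout.flush()
                total += len(chunk)
                last_data = time.monotonic()
            elif time.monotonic() - last_data >= idle_timeout:
                break                      # nothing new for a while: stop following
            else:
                time.sleep(poll_interval)  # at EOF for now: wait for more writes
    return total
```

For the example above this would be called as follow_copy("/proc/1234/fd/15", "recovered_file.txt"). Once the source is open, the copy keeps working even if the original process exits, because the reader now holds its own descriptor on the inode.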
Method 2: Recreate the File Link (Advanced)
For ext2/3/4 filesystems, you can use debugfs to relink the inode:
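# First identify the inode number (-L follows the /proc link; without it you get the inode of the /proc entry itself)
$ ls -liL /proc/1234/fd/15
# Then in debugfs (the angle brackets are literal: they mark the argument as an inode number)
$ sudo debugfs -w /dev/sda1
debugfs: ln <inode_number> /path/to/recovery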
Warning: This directly manipulates filesystem metadata and can be dangerous.
Method 3: Using gdb (For Critical Cases)
For processes holding critical files open:
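$ sudo gdb -p 1234
# 0 is O_RDONLY; gdb does not know the macro without debug info
(gdb) call (int)open("/proc/self/fd/15", 0)
# Preserve the fd
(gdb) call (int)dup2($1, 100)
(gdb) call (int)close(15)
(gdb) quit
This keeps a second descriptor (fd 100) open on the inode inside the process, so the data is not freed. Be aware that after close(15) the process can no longer write through its original descriptor; skip that step if it must keep writing.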
For production systems, consider these preventative measures:
- Implement proper file locking mechanisms
- Use version control for critical files
- Set up monitoring for file deletions
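One inexpensive safeguard in this spirit is to keep a second hard link to each critical file, so an accidental rm never brings the link count to 0. This is a sketch rather than a recommendation from the original answer: keep_guard_link, restore_from_guard and the guard directory are hypothetical, and it assumes the guard directory is on the same filesystem and that you are permitted to link the file.

```python
import os


def keep_guard_link(path, guard_dir):
    """Create an extra hard link to path inside guard_dir and return its path."""
    os.makedirs(guard_dir, exist_ok=True)
    guard = os.path.join(guard_dir, os.path.basename(path))
    if os.path.exists(guard) and not os.path.samefile(path, guard):
        os.unlink(guard)      # stale link to an older inode (e.g. after rotation)
    if not os.path.exists(guard):
        os.link(path, guard)  # second name for the same inode
    return guard


def restore_from_guard(guard, path):
    """Recreate the original name from the guard link: same inode, no copy."""
    os.link(guard, path)
```

Because the restored name points at the same inode, a process that still has the file open keeps writing to the file you just restored.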
Here's a Python script that monitors and protects critical files:
#!/usr/bin/env python3
import os
import inotify.adapters

def protect_file(filename):
    # Resolve to an absolute path first: dirname() of a bare file name is ''
    filename = os.path.abspath(filename)
    i = inotify.adapters.Inotify()
    # Watch the parent directory, since deletions are reported there
    i.add_watch(os.path.dirname(filename))
    for event in i.event_gen():
        if event is not None:
            header, type_names, path, name = event
            if 'IN_DELETE' in type_names and name == os.path.basename(filename):
                print(f"ALERT: {filename} deleted!")
                # Take recovery action here
We've all been there - accidentally deleting a critical log file or data file that's still being written to by a long-running process. While you can access the content through /proc/<pid>/fd/N, this approach has significant limitations:
# Just accessing the content isn't enough
cat /proc/1234/fd/15 > recovered_file.log
The fundamental issue is that the filesystem namespace link to the inode is gone, even though the process maintains an open file descriptor.
When you're dealing with an actively written file, simply reading from the file descriptor has several drawbacks:
- You get a static copy, not a live reference
- The process can't be easily restarted without losing data
- You lose all filesystem metadata and permissions
- It doesn't solve the namespace visibility problem
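The metadata drawback can be softened, because stat() on the /proc entry still reports the deleted inode's mode, owner and timestamps. A minimal sketch, assuming Linux; recover_with_metadata is a hypothetical helper, and restoring the owner normally needs root.

```python
import os
import shutil
import stat


def recover_with_metadata(proc_fd_path, dest):
    """Copy the data behind proc_fd_path to dest, then reapply the original
    mode, timestamps and (if permitted) owner."""
    info = os.stat(proc_fd_path)  # follows the /proc link to the still-open inode
    with open(proc_fd_path, "rb") as src, open(dest, "wb") as out:
        shutil.copyfileobj(src, out)
    os.chmod(dest, stat.S_IMODE(info.st_mode))
    os.utime(dest, ns=(info.st_atime_ns, info.st_mtime_ns))
    try:
        os.chown(dest, info.st_uid, info.st_gid)
    except PermissionError:
        pass  # not allowed to change the owner: keep the current one
    return info
```

The result is still a new inode, so this addresses only the metadata point, not the live-reference or namespace ones.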
While debugfs can technically relink files, it's extremely risky:
debugfs -w /dev/sda1
debugfs: ln <inode_number> /path/to/recovered_file
This directly manipulates filesystem structures and can lead to:
- Filesystem corruption
- Inconsistent link counts
- Unrecoverable metadata issues
Here's a more reliable approach that works with most modern Linux systems:
# 1. Find the process and file descriptor
lsof | grep deleted | grep yourfilename
# 2. Get the actual file descriptor path
ls -l /proc/<pid>/fd/ | grep deleted
# 3. Copy the data out through the descriptor
sudo cp /proc/<pid>/fd/<fdnum> /path/to/new/location
Note that ln cannot be used for this step: readlink returns the old path with " (deleted)" appended, which no longer exists, and the kernel refuses to hard-link an inode whose link count has already dropped to 0. The copy recovers the contents into a new file with a new inode.
When recovering files this way:
- Ensure you have root privileges if the process belongs to another user
- The original process must keep running until recovery is complete
- If you used debugfs, check filesystem integrity afterwards (fsck may be needed)
- Consider file permissions - you may need to chown afterwards
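Steps 1 and 2 above can also be done without lsof by scanning /proc directly. The following sketch assumes Linux; find_deleted_open_files is a hypothetical helper, and it relies on the " (deleted)" suffix the kernel appends to the link target of an unlinked file.

```python
import os


def find_deleted_open_files(pid="self"):
    """Return sorted (fd_number, original_path) pairs for deleted files
    that the given process still holds open."""
    fd_dir = f"/proc/{pid}/fd"
    suffix = " (deleted)"
    found = []
    for name in os.listdir(fd_dir):
        link = os.path.join(fd_dir, name)
        try:
            target = os.readlink(link)
            nlink = os.stat(link).st_nlink
        except OSError:
            continue  # descriptor was closed, or we lack permission
        # nlink == 0 filters out files whose name merely ends in " (deleted)"
        if nlink == 0 and target.startswith("/") and target.endswith(suffix):
            found.append((int(name), target[: -len(suffix)]))
    return sorted(found)
```

Calling find_deleted_open_files(1234) lists the descriptors worth recovering for that process; inspecting another user's process requires root.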
For frequent needs, you might create a recovery script:
#!/bin/bash
PID=$1
FD=$2
DEST=$3
if [ -z "$PID" ] || [ -z "$FD" ] || [ -z "$DEST" ]; then
    echo "Usage: $0 <pid> <fd> <destination>"
    exit 1
fi
SRC="/proc/$PID/fd/$FD"
if [ ! -e "$SRC" ]; then
    echo "File descriptor $FD not found for process $PID"
    exit 1
fi
# The kernel refuses to hard-link an inode that is already unlinked, so copy the data out instead
sudo cp "$SRC" "$DEST"