When debugging complex applications or investigating security incidents, developers often need to track every file a process accesses throughout its entire execution. While lsof shows currently open files, it doesn't provide historical access data. Here are more comprehensive approaches:
The most direct method is tracing system calls using strace:
strace -f -e trace=open,openat,close,read,write -o /tmp/trace.log ./your_program
This will log all file operations including:
- File openings (open, openat)
- File closings (close)
- Read/write operations (optional)
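Once the trace exists, the open calls can be pulled out of it programmatically. The following is a minimal sketch, assuming the default strace line layout written by -f -o (an optional PID prefix, then the call and its return value); the name parse_open_calls and the sample lines are illustrative and not part of strace. Lines that strace splits into "unfinished"/"resumed" halves are not handled here.

```python
import re

# Matches lines such as:
#   1234 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
# The leading PID is present when strace runs with -f and -o.
OPEN_RE = re.compile(
    r'^(?:(?P<pid>\d+)\s+)?'          # optional PID prefix
    r'(?P<call>open|openat)\('        # syscall name
    r'(?:[^,"]+,\s*)?'                # openat's dirfd argument, if any
    r'"(?P<path>(?:[^"\\]|\\.)*)"'    # quoted path (may contain escapes)
    r'.*\)\s+=\s+(?P<ret>-?\d+)'      # return value: a descriptor or -1
)

def parse_open_calls(lines):
    """Yield (pid, syscall, path, succeeded) for each open/openat line."""
    for line in lines:
        m = OPEN_RE.match(line)
        if m:
            pid = int(m.group('pid')) if m.group('pid') else None
            yield pid, m.group('call'), m.group('path'), int(m.group('ret')) >= 0

# Hand-written sample lines in strace's usual format (illustrative only)
sample = [
    '1234 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3',
    '1234 read(3, "127.0.0.1 localhost\\n", 4096) = 20',
    '1235 open("/missing.conf", O_RDONLY) = -1 ENOENT (No such file or directory)',
]
for record in parse_open_calls(sample):
    print(record)
```

To run it against a real trace, pass an open file object instead of the sample list, for example parse_open_calls(open('/tmp/trace.log')).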
For enterprise-grade monitoring, configure audit rules:
# Monitor all files opened by specific process
auditctl -a exit,always -F arch=b64 -S openat -F pid=1234
# Or monitor files in specific directory
auditctl -w /path/to/watch -p war -k file_access
View logs using the ausearch or aureport utilities.
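Raw audit records can also be post-processed to list the paths each process touched. The sketch below assumes the raw record layout (as printed by ausearch --raw or stored in audit.log), where the SYSCALL and PATH records of one event share the serial number inside msg=audit(...). The function names and the abbreviated sample records are illustrative; real records carry many more fields, and names containing special characters may appear hex-encoded, which this sketch does not decode.

```python
import re
from collections import defaultdict

# The serial number after the colon ties together the records of one event
EVENT_ID_RE = re.compile(r'msg=audit\((?P<ts>[\d.]+):(?P<serial>\d+)\)')
# key=value pairs; values are either double-quoted or a run of non-spaces
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

def parse_record(line):
    """Return (serial, fields) for one raw audit record, or None."""
    m = EVENT_ID_RE.search(line)
    if not m:
        return None
    fields = {key: value.strip('"') for key, value in FIELD_RE.findall(line)}
    return m.group('serial'), fields

def paths_by_pid(lines):
    """Map each PID to the path names recorded in its audit events."""
    pid_of_event = {}
    paths_of_event = defaultdict(list)
    for line in lines:
        parsed = parse_record(line)
        if parsed is None:
            continue
        serial, fields = parsed
        if fields.get('type') == 'SYSCALL' and 'pid' in fields:
            pid_of_event[serial] = int(fields['pid'])
        elif fields.get('type') == 'PATH' and 'name' in fields:
            paths_of_event[serial].append(fields['name'])
    result = defaultdict(list)
    for serial, paths in paths_of_event.items():
        if serial in pid_of_event:
            result[pid_of_event[serial]].extend(paths)
    return dict(result)

# Abbreviated, hand-written sample records (illustrative only)
sample = [
    'type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=257 '
    'success=yes exit=3 ppid=2686 pid=1234 comm="cat" exe="/bin/cat" key="file_access"',
    'type=CWD msg=audit(1364481363.243:24287): cwd="/home/user"',
    'type=PATH msg=audit(1364481363.243:24287): item=0 name="/etc/ssh/sshd_config" inode=409248',
]
print(paths_by_pid(sample))
```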
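For applications you control, consider adding inotify hooks. Note that inotify reports events on the watched paths regardless of which process caused them, so it cannot attribute an access to a specific process: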
# Python example using pyinotify
import pyinotify

wm = pyinotify.WatchManager()
# Watch opens plus both close variants explicitly
mask = pyinotify.IN_OPEN | pyinotify.IN_CLOSE_WRITE | pyinotify.IN_CLOSE_NOWRITE

class EventHandler(pyinotify.ProcessEvent):
    def process_IN_OPEN(self, event):
        print(f"File opened: {event.pathname}")

    # Family handler: receives both IN_CLOSE_WRITE and IN_CLOSE_NOWRITE
    def process_IN_CLOSE(self, event):
        print(f"File closed: {event.pathname}")

notifier = pyinotify.Notifier(wm, EventHandler())
# rec=True also watches the subdirectories that already exist
wdd = wm.add_watch('/path/to/watch', mask, rec=True)
notifier.loop()
For advanced users, kernel probes can trace the kernel's file-open path directly:
# Probe do_sys_open (run as root); on x86-64 the filename pointer is its second argument, held in %si
echo 'p:myprobe do_sys_open filename=+0(%si):string' > /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
cat /sys/kernel/debug/tracing/trace_pipe
Remember that comprehensive tracing adds overhead:
- strace can slow execution by 10-100x
- auditd has significant memory impact at scale
- Inotify works best for targeted directories
Other utilities worth considering:
- fatrace: Filesystem activity monitor
- opensnoop: from BPF tools
- sysdig: Container-aware system exploration
When debugging complex applications or investigating security issues, developers often need to track every file accessed by a process throughout its entire lifetime. While tools like lsof show currently open files, they don't provide historical access data.
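The limitation is easy to see with a snapshot of a process's descriptor table. The sketch below assumes a Linux system with /proc mounted and reads /proc/<pid>/fd, which gives the same kind of point-in-time view; open_files is a hypothetical helper name. A file that was opened and closed between two snapshots leaves no trace in either.

```python
import os
import tempfile

def open_files(pid="self"):
    """Return the set of paths that /proc/<pid>/fd currently points at."""
    fd_dir = f"/proc/{pid}/fd"
    paths = set()
    if not os.path.isdir(fd_dir):      # not Linux, or the process is gone
        return paths
    for fd in os.listdir(fd_dir):
        try:
            paths.add(os.readlink(os.path.join(fd_dir, fd)))
        except OSError:
            pass  # descriptor was closed between listdir() and readlink()
    return paths

# Take one snapshot while a temporary file is open and one after it is closed
with tempfile.NamedTemporaryFile() as tmp:
    target = os.path.realpath(tmp.name)
    seen_while_open = target in open_files()
seen_after_close = target in open_files()
print(seen_while_open, seen_after_close)
```

On Linux the temporary file should appear only in the first snapshot, which is why a history of accesses needs tracing rather than polling.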
The most reliable method is using strace to monitor system calls:
strace -f -e trace=open,openat,close,creat,execve \
-o process_trace.log \
-s 1024 \
./your_application
This command will log all file-related operations including:
- File openings (open, openat)
- File creations (creat)
- Execution of new programs (execve)
For production systems, Linux's audit subsystem provides more robust tracking:
# Install auditd if not present
sudo apt install auditd
# Add a syscall rule for a specific PID
sudo auditctl -a exit,always -F arch=b64 -S openat -F pid=1234
# View the logs
sudo ausearch -p 1234 -i
For applications where you can control the execution environment, intercepting file operations via library preloading can be effective:
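// file_tracer.c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

typedef int (*orig_open_func_t)(const char *pathname, int flags, ...);

// Intercepts open() only; openat(), open64(), fopen() etc. need their own wrappers
int open(const char *pathname, int flags, ...) {
    // Look up the next definition of open() in link order (normally libc's)
    orig_open_func_t orig_open = (orig_open_func_t)dlsym(RTLD_NEXT, "open");
    mode_t mode = 0;
    // The third argument is only passed with O_CREAT or O_TMPFILE
    if ((flags & O_CREAT) || (flags & O_TMPFILE) == O_TMPFILE) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    fprintf(stderr, "File accessed: %s\n", pathname);
    return orig_open(pathname, flags, mode);
}
// Compile with: gcc -shared -fPIC -o file_tracer.so file_tracer.c -ldl
// Usage: LD_PRELOAD=./file_tracer.so ./your_application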
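While these methods are powerful, they impact performance:
- strace can slow execution by 10-100x
- Auditd adds moderate overhead but is more efficient
- LD_PRELOAD has the least overhead and needs no recompilation of the target, but it only works for dynamically linked programs and only sees calls that go through the wrapped library functions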
For complex applications, processing the logs can reveal valuable patterns:
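# Count accesses per file path (the path is the first quoted string on each openat line)
awk -F'"' '/openat\(/ {print $2}' process_trace.log | sort | uniq -c | sort -nr
# Generate a timeline (needs a trace recorded with strace -f -tt, so each line is "PID TIME call(...)")
awk -F'"' '/openat\(/ {split($1, f, " "); print f[2], $2}' process_trace.log > access_timeline.dat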
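For grouping that goes beyond one-liners, the same log can be summarized in Python. This is a rough sketch, assuming the usual strace line format; summarize and the sample lines are illustrative. It counts successful opens by file extension and keeps failed opens separately, since repeated failures often point at missing configuration or search-path probing.

```python
import os
import re
from collections import Counter

# First quoted argument of open()/openat(), plus the numeric return value
PATH_RE = re.compile(r'\bopen(?:at)?\(.*?"((?:[^"\\]|\\.)*)".*\)\s+=\s+(-?\d+)')

def summarize(lines):
    """Return (successful opens per extension, failed opens per path)."""
    by_ext = Counter()
    failed = Counter()
    for line in lines:
        m = PATH_RE.search(line)
        if not m:
            continue
        path, ret = m.group(1), int(m.group(2))
        if ret < 0:
            failed[path] += 1
        else:
            ext = os.path.splitext(path)[1] or '(none)'
            by_ext[ext] += 1
    return by_ext, failed

# Hand-written sample lines in strace's usual format (illustrative only)
sample = [
    '1234 openat(AT_FDCWD, "/usr/lib/libssl.so", O_RDONLY|O_CLOEXEC) = 3',
    '1234 openat(AT_FDCWD, "/usr/lib/libz.so", O_RDONLY|O_CLOEXEC) = 3',
    '1234 openat(AT_FDCWD, "/app/config.yaml", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '1234 openat(AT_FDCWD, "/app/data.json", O_RDONLY) = 4',
    '1234 openat(AT_FDCWD, "/etc/hostname", O_RDONLY) = 4',
]
by_ext, failed = summarize(sample)
for ext, count in by_ext.most_common():
    print(f"{count:4d} {ext}")
for path, count in failed.items():
    print(f"failed x{count}: {path}")
```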