Timezone Discrepancy in find -mtime: Why Files with +0400 Offset Get Missed in Linux File Searches



When running find -mtime commands on our CentOS 5.5 server, we noticed something peculiar: some files that should match our time-based criteria were mysteriously excluded from the results. The issue became particularly apparent when comparing these two backup files:

-rw-r--r-- 1 root root    347253 Jun 12 16:26 pedia_main.2010-06-12-04-25-02.sql.gz
-rw-r--r-- 1 root root 490144578 Nov 24 16:26 gsmforum_main.2010-11-24-04-25-02.sql.gz

The first file wouldn't appear in find . -mtime 1 results while the second one would, despite both appearing similarly in regular directory listings.

The breakthrough came when examining the files with ls --full-time:

-rw-r--r-- 1 root root    347253 2010-06-12 16:26:20.000000000 +0400 pedia_main.2010-06-12-04-25-02.sql.gz
-rw-r--r-- 1 root root 490144578 2010-11-24 16:26:12.000000000 +0300 gsmforum_main.2010-11-24-04-25-02.sql.gz

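Here we see the visible difference: the first file is shown with a +0400 offset while the second is shown with +0300. The offset itself is not stored with the file, but it shows that the two timestamps fall on opposite sides of a daylight saving transition, which is where time-based tests in find can produce surprising results.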
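As a rough illustration of where the two offsets come from, the sketch below formats both modification times with GNU date under a POSIX TZ rule. The rule string is an assumption that approximates 2010 Moscow time (the source does not name the server's zone), and the epoch values were derived by hand from the ls --full-time output above.

```shell
# Assumed TZ rule: UTC+3 standard time, UTC+4 daylight time from the last
# Sunday in March to the last Sunday in October (approximates Moscow in 2010).
tz_rule='MSK-3MSD,M3.5.0/2,M10.5.0/3'

# Epoch seconds derived by hand from the two listings; treat as illustrative.
june_epoch=1276345580
nov_epoch=1290605172

# Same zone rule, two different moments: the offset follows the date.
june_shown=$(TZ="$tz_rule" date -d "@$june_epoch" '+%Y-%m-%d %H:%M:%S %z')
nov_shown=$(TZ="$tz_rule" date -d "@$nov_epoch" '+%Y-%m-%d %H:%M:%S %z')

echo "$june_shown"
echo "$nov_shown"
```

Both lines should show 16:26 local time, with +0400 for the June timestamp and +0300 for the November one, even though the zone setting never changed between the two calls.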

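The find command calculates file ages based on:

  1. The current time, as seconds since the epoch (UTC)
  2. The file's modification timestamp, also stored as seconds since the epoch (no timezone offset attached)
  3. Whole 24-hour periods between the two, with any fractional part discarded (for -mtime calculations)

The timezone offset only affects how those timestamps are displayed and how local-time boundaries (such as the midnight used by -daystart) map to UTC. When the offset changes, as it does at a daylight saving transition, a file can land an hour on the wrong side of such a boundary and fall outside the expected range. Note also that -mtime 1 matches only files whose age rounds down to exactly one day (24 to 48 hours old); use -mtime +1 or -mtime -1 for older or newer files.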
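The age arithmetic can be sketched in a few lines of shell. This is a minimal model based on the documented behaviour of GNU find's -mtime (fractional days are ignored), not find's actual source; age_in_days is a hypothetical helper name.

```shell
# Hypothetical helper: whole days between a file's mtime ($1) and "now" ($2),
# both given as epoch seconds. Integer division discards the fractional part.
age_in_days() {
    echo $(( ($2 - $1) / 86400 ))
}

now=$(date +%s)
age_in_days "$(( now - 20 * 3600 ))" "$now"   # file modified 20 hours ago
age_in_days "$(( now - 30 * 3600 ))" "$now"   # file modified 30 hours ago
age_in_days "$(( now - 50 * 3600 ))" "$now"   # file modified 50 hours ago
```

Under this model only the 30-hour-old file has an age of exactly 1, which is why -mtime 1 selects a narrow 24-hour window rather than "anything about a day old".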

Here are three approaches to handle timezone discrepancies:

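# 1. Use find's -daystart flag to measure ages from the start of today (local time) instead of from 24 hours ago
find . -daystart -mtime 7

# 2. Give explicit UTC boundaries with -newermt (needs GNU findutils 4.3.3 or later, which is likely newer than the stock CentOS 5 package)
find . -newermt "2023-01-01 00:00:00 +0000" ! -newermt "2023-01-08 00:00:00 +0000"

# 3. Use stat to inspect exact timestamps before filtering (anchor the pattern so it matches the date, not the filename)
stat -c '%y %n' -- * | grep '^2023-01'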

Contrary to some expectations, the timezone offset isn't stored in the file's inode or extended attributes. Instead, it's:

  • Calculated from the system's timezone database
  • Applied when displaying or interpreting timestamps
  • Particularly relevant for files modified on either side of a DST transition, where the displayed offset differs

On ext4 filesystems (like in our case), the modification time is stored as seconds since epoch (UTC), but the local timezone interpretation can vary based on system settings.
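A quick way to check this on a scratch file is to read the same inode under two TZ settings with GNU stat. The sketch below assumes GNU touch and stat; the TZ rule is an illustrative assumption, not a value taken from the server in question.

```shell
tz_rule='MSK-3MSD,M3.5.0/2,M10.5.0/3'   # assumed rule, for illustration only

# Create a scratch file and set its mtime to a known epoch value.
sample=$(mktemp)
touch -d @1276345580 "$sample"

# %Y prints the stored epoch seconds; %y prints the human-readable form.
epoch_utc=$(TZ=UTC stat -c '%Y' "$sample")
epoch_local=$(TZ="$tz_rule" stat -c '%Y' "$sample")
shown_utc=$(TZ=UTC stat -c '%y' "$sample")
shown_local=$(TZ="$tz_rule" stat -c '%y' "$sample")

printf '%s\n' "$epoch_utc" "$epoch_local" "$shown_utc" "$shown_local"
rm -f "$sample"
```

The two epoch values should be identical, while the two formatted strings differ only in the wall-clock time and offset chosen for display.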

For more predictable results across timezones, consider:

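# Using an explicit UTC cutoff built with GNU date (-mtime expects a day count, not an epoch value)
find . -newermt "$(date -u -d '7 days ago' '+%Y-%m-%d %H:%M:%S +0000')"

# Perl one-liner for precise control (-M is the file's age in fractional days)
perl -e 'use File::Find; find(sub { print "$File::Find::name\n" if -M $_ <= 7 }, ".")'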

When dealing with file searches across timezones or DST changes:

  • Always check full timestamps with ls --full-time or stat
  • Consider normalizing your environment to UTC when possible
  • Test find commands with known files before relying on them in scripts
  • Document timezone assumptions in your backup/cleanup procedures
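One way to test a find expression against known files is to build a scratch directory with controlled modification times. The sketch below assumes GNU touch and GNU find (for -d and -printf); the file names are hypothetical.

```shell
# Scratch directory with three files of known age.
demo_dir=$(mktemp -d)
touch -d '20 hours ago' "$demo_dir/age_20h"
touch -d '30 hours ago' "$demo_dir/age_30h"
touch -d '50 hours ago' "$demo_dir/age_50h"

# -mtime 1 should match only the file whose age truncates to one day.
matched=$(find "$demo_dir" -type f -mtime 1 -printf '%f\n')
echo "$matched"

rm -rf "$demo_dir"
```

Swapping in -mtime +1, -mtime -1, or a -daystart variant makes it easy to see which files each form selects before the expression goes into a cleanup script.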

The find command calculates file ages from epoch (UTC) timestamps, while ls displays those timestamps in local time. The two views can appear to disagree when:

  • Files were modified on either side of a daylight saving time transition
  • The system timezone changed at some point
  • Files are inspected on servers with different timezone settings

Option 1: Use find -daystart to anchor at midnight local time:

find . -daystart -mtime 7
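Because -daystart measures from the beginning of today in local time, the boundary it uses depends on the timezone in effect. The sketch below prints the epoch second at which "today" starts under two settings, using GNU date; MSK-3 is an assumed fixed UTC+3 zone chosen only for illustration.

```shell
# Epoch second at which the current day begins under each timezone setting.
utc_daystart=$(TZ=UTC date -d 'today 00:00' +%s)
msk_daystart=$(TZ='MSK-3' date -d 'today 00:00' +%s)

echo "UTC day starts at:   $utc_daystart"
echo "UTC+3 day starts at: $msk_daystart"
```

The two values differ, so a file modified close to midnight can fall into different day buckets depending on which TZ find runs under.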

Option 2: Compare raw epoch timestamps, which are independent of the timezone:

find . -type f -printf '%T@ %p\n' | awk -v cutoff="$(date -d '7 days ago' +%s)" '$1 < cutoff { sub(/^[^ ]+ /, ""); print }'

Option 3: Use stat to inspect exact timestamps:

stat -c '%y %n' *.gz

Contrary to initial assumptions, timezone information isn't stored in the filesystem. The displayed offset comes from:

  1. The raw UTC timestamp stored in the inode
  2. The current system timezone setting during display

For consistent behavior across servers:

# Set system timezone to UTC
sudo ln -sf /usr/share/zoneinfo/UTC /etc/localtime

# TZ only changes results for tests tied to local time, such as -daystart
TZ=UTC find . -daystart -mtime 7