Traditional file expiration methods using file attributes fall short when you need to:
- Track file residency time rather than modification/creation dates
- Maintain accuracy despite Windows Explorer access patterns
- Handle files moved or copied into the directory
Here's a PowerShell solution that creates a tracking database using file hashes:
# Initialize the tracking database if it doesn't exist
$dbPath = "C:\FileExpiryTracker\fileTracking.db"
if (!(Test-Path $dbPath)) {
    # Ensure the parent directory exists before Export-Clixml writes to it
    New-Item -ItemType Directory -Path (Split-Path $dbPath) -Force | Out-Null
    @{} | Export-Clixml -Path $dbPath
}
# Daily maintenance script
function Update-FileTracking {
    param (
        [string]$targetFolder,
        [int]$expiryDays
    )

    $currentDB = Import-Clixml -Path $dbPath
    $currentFiles = Get-ChildItem -Path $targetFolder -File

    # Hash each file once, building a hash-to-path lookup, and record
    # a first-seen date for any file not yet in the database
    $hashLookup = @{}
    foreach ($file in $currentFiles) {
        $hash = (Get-FileHash $file.FullName -Algorithm SHA256).Hash
        $hashLookup[$hash] = $file.FullName
        if (!$currentDB.ContainsKey($hash)) {
            $currentDB[$hash] = Get-Date
        }
    }

    # Find entries older than the retention window
    $cutoffDate = (Get-Date).AddDays(-$expiryDays)
    $expiredHashes = @($currentDB.GetEnumerator() |
        Where-Object { $_.Value -lt $cutoffDate } |
        Select-Object -ExpandProperty Key)

    # Delete expired files (if still present) and drop their tracking entries.
    # Capturing the hash in a named variable avoids the $_ scoping trap of
    # comparing a file against itself inside a nested Where-Object.
    foreach ($expiredHash in $expiredHashes) {
        if ($hashLookup.ContainsKey($expiredHash)) {
            Remove-Item $hashLookup[$expiredHash] -Force
        }
        $currentDB.Remove($expiredHash)
    }

    $currentDB | Export-Clixml -Path $dbPath
}
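A minimal invocation, for a 30-day retention on a hypothetical drop folder:
# Example invocation (path is illustrative; adjust for your environment)
Update-FileTracking -targetFolder "D:\Shared\DropFolder" -expiryDays 30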
For better performance than hashing, consider using NTFS alternate data streams to store an arrival timestamp directly on each file. Streams survive moves between NTFS volumes but are lost when a file is copied to non-NTFS storage:
function Set-FileArrivalTime {
    param (
        [string]$filePath
    )
    # Record the arrival timestamp in a named NTFS alternate data stream,
    # using the round-trip date format so it parses back unambiguously
    Set-Content -Path $filePath -Stream 'ArrivalTime' -Value (Get-Date).ToString('o')
}
function Get-ExpiredFiles {
    param (
        [string]$folderPath,
        [int]$days
    )
    $cutoff = (Get-Date).AddDays(-$days)
    Get-ChildItem $folderPath -File | Where-Object {
        $arrival = Get-Content -Path $_.FullName -Stream 'ArrivalTime' -ErrorAction SilentlyContinue
        if ($arrival) {
            [datetime]::Parse($arrival) -lt $cutoff
        } else {
            # No arrival stamp recorded; don't treat the file as expired
            $false
        }
    }
}
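One way to stamp files as they land is a FileSystemWatcher event subscription. A minimal sketch, assuming the functions above are already loaded in the session and using a placeholder path (a periodic rescan is still advisable, since watchers miss events while the process is down):
# Sketch: stamp new arrivals as they appear
$watcher = New-Object System.IO.FileSystemWatcher 'D:\Shared\DropFolder'
$watcher.EnableRaisingEvents = $true
Register-ObjectEvent -InputObject $watcher -EventName Created -Action {
    # Stamp the newly created file with its arrival time
    Set-FileArrivalTime -filePath $Event.SourceEventArgs.FullPath
} | Out-Null

# Cleanup pass: delete anything that arrived more than 30 days ago
Get-ExpiredFiles -folderPath 'D:\Shared\DropFolder' -days 30 | Remove-Item -Force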
When deploying this solution:
- Schedule the script as a daily task running under the SYSTEM account
- Handle files that other processes have locked during hash calculation
- Implement logging for audit purposes (see the sketch after this list)
- Add error handling for permission issues
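As a starting point for the last two items, here is a minimal sketch that wraps the maintenance run in a transcript log with basic error handling; the paths are placeholders:
# Sketch: audited run with basic error handling (hypothetical paths)
$logDir = 'C:\FileExpiryTracker\Logs'
if (!(Test-Path $logDir)) { New-Item $logDir -ItemType Directory | Out-Null }
Start-Transcript -Path (Join-Path $logDir ("cleanup-{0:yyyyMMdd}.log" -f (Get-Date)))
try {
    Update-FileTracking -targetFolder 'D:\Shared\DropFolder' -expiryDays 30
}
catch [System.UnauthorizedAccessException] {
    Write-Warning "Permission denied: $($_.Exception.Message)"
}
catch {
    Write-Warning "Cleanup failed: $($_.Exception.Message)"
}
finally {
    Stop-Transcript
}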
For large directories, these optimizations help:
# Faster hashing for large files: hash only the first N MB.
# Caveat: files that differ only beyond the sampled region will collide,
# so pair the sample hash with the file length for extra safety.
function Get-FastFileHash {
    param (
        [string]$filePath,
        [int]$sampleSizeMB = 10
    )
    $stream = [System.IO.File]::OpenRead($filePath)
    try {
        $buffer = New-Object byte[] (1MB * $sampleSizeMB)
        $bytesRead = $stream.Read($buffer, 0, $buffer.Length)
    }
    finally {
        $stream.Close()
    }
    $hasher = [System.Security.Cryptography.SHA256]::Create()
    $hashBytes = $hasher.ComputeHash($buffer, 0, $bytesRead)
    $hasher.Dispose()
    [BitConverter]::ToString($hashBytes).Replace("-","")
}
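Since a sampled hash can collide for files that share a common prefix, one hedge is to key the tracking database on the sample hash combined with the file length. A small illustration (the helper name Get-FileKey is hypothetical):
# Illustration: combine the sampled hash with the file length to reduce collisions
function Get-FileKey {
    param ([string]$filePath)
    $length = (Get-Item $filePath).Length
    "{0}-{1}" -f (Get-FastFileHash -filePath $filePath), $length
}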
Stepping back to the underlying problem: creating a self-cleaning shared folder in Windows presents unique technical hurdles because traditional file attributes prove unreliable. Administrators need to automatically purge stale files based on when they were added to the directory, not when they were created or modified, a distinction most standard solutions fail to address.
Common methods using file metadata (creation date, modified date, or last accessed) don't work for drop folders because:
- Moving files preserves original creation dates (demonstrated just after this list)
- Windows updates last-accessed timestamps unpredictably
- Modified dates only change with content alterations
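A quick demonstration of the first point, using illustrative paths on the same volume (a same-volume move is a rename, so every timestamp survives):
# Demonstrate that moving a file preserves its original creation date
$file = New-Item 'D:\Temp\report.txt' -ItemType File -Force
$file.CreationTime = (Get-Date).AddDays(-90)   # simulate a 90-day-old file
Move-Item 'D:\Temp\report.txt' 'D:\Shared\DropFolder\report.txt'
# Still reports the 90-day-old date, so creation-date-based cleanup
# would delete this file the moment it arrives
(Get-Item 'D:\Shared\DropFolder\report.txt').CreationTime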
With that in mind, here's an alternative implementation that stores its tracking data in a plain CSV file (easy to inspect and audit) and combines file tracking with scheduled cleanup:
# FileTracker.ps1
$dropFolder = "\\server\shared\DropFolder"
$trackingDB = "\\server\shared\FileTracker.csv"
$retentionDays = 30
# Create the tracking file with a CSV header if it doesn't exist
if (-not (Test-Path $trackingDB)) {
    "FileHash,DetectionDate" | Out-File $trackingDB -Encoding UTF8
}
# Get current files and their hashes
$currentFiles = Get-ChildItem $dropFolder -File | ForEach-Object {
    $hash = (Get-FileHash $_.FullName -Algorithm SHA256).Hash
    [PSCustomObject]@{
        FileHash = $hash
        FilePath = $_.FullName
    }
}
# Update the tracking database with newly detected files
$trackedFiles = @(Import-Csv $trackingDB)
$newEntries = $currentFiles | Where-Object { $_.FileHash -notin $trackedFiles.FileHash }
foreach ($entry in $newEntries) {
    "$($entry.FileHash),$(Get-Date -Format 'yyyy-MM-dd')" | Out-File $trackingDB -Append -Encoding UTF8
}
# Clean up files whose tracked detection date has expired. The membership
# test must be -in (not -notin): we delete files that are still present
# in the folder, and the named $expired variable avoids the $_ scoping
# trap of comparing a hash against itself inside the nested Where-Object.
$cutoffDate = (Get-Date).AddDays(-$retentionDays)
$expiredFiles = $trackedFiles | Where-Object {
    [datetime]$_.DetectionDate -lt $cutoffDate -and
    $_.FileHash -in $currentFiles.FileHash
}
foreach ($expired in $expiredFiles) {
    $matchingFile = $currentFiles | Where-Object { $_.FileHash -eq $expired.FileHash }
    if ($matchingFile) {
        Remove-Item $matchingFile.FilePath -Force
        Write-Host "Removed expired file: $($matchingFile.FilePath)"
    }
}
# Prune expired entries from the database so a re-added copy of the same
# file gets a fresh detection date instead of being deleted immediately
$remaining = Import-Csv $trackingDB | Where-Object { [datetime]$_.DetectionDate -ge $cutoffDate }
"FileHash,DetectionDate" | Out-File $trackingDB -Encoding UTF8
foreach ($entry in $remaining) {
    "$($entry.FileHash),$($entry.DetectionDate)" | Out-File $trackingDB -Append -Encoding UTF8
}
For high-volume environments, consider using the NTFS USN Journal to track file additions more efficiently:
// UsnJournalReader.cs
using System;
using System.IO;
using System.Runtime.InteropServices;
public class UsnJournalReader
{
    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern IntPtr CreateFile(
        string lpFileName,
        uint dwDesiredAccess,
        uint dwShareMode,
        IntPtr lpSecurityAttributes,
        uint dwCreationDisposition,
        uint dwFlagsAndAttributes,
        IntPtr hTemplateFile);

    // Additional P/Invoke declarations would go here.
    // The full implementation would process USN journal entries
    // to detect newly added files.
}
- Schedule the PowerShell script as a daily task using Windows Task Scheduler (see the sketch after this list)
- For large folders, consider adding:
  - File extension filters
  - Size-based retention rules
  - Email notifications before deletion
- Test thoroughly with different file operations (drag-drop, copy/paste, robocopy moves)
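One way to register that daily task, sketched with the built-in ScheduledTasks cmdlets; the script path, task name, and run time are placeholders:
# Sketch: register a daily 2 AM cleanup task running as SYSTEM (hypothetical paths)
$action    = New-ScheduledTaskAction -Execute 'powershell.exe' `
             -Argument '-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\FileTracker.ps1'
$trigger   = New-ScheduledTaskTrigger -Daily -At 2am
$principal = New-ScheduledTaskPrincipal -UserId 'SYSTEM' -LogonType ServiceAccount -RunLevel Highest
Register-ScheduledTask -TaskName 'DropFolderCleanup' -Action $action -Trigger $trigger -Principal $principal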
The hashing operation can be CPU-intensive. These optimizations help:
# Faster full-file hash: MD5 is quicker than SHA-256 and adequate for
# change detection in this context (it is not a security boundary)
function Get-FastHash {
    param([string]$filePath)
    $md5 = [System.Security.Cryptography.MD5]::Create()
    $stream = [IO.File]::OpenRead($filePath)
    try {
        $hash = $md5.ComputeHash($stream)
    }
    finally {
        $stream.Close()
        $md5.Dispose()
    }
    [BitConverter]::ToString($hash).Replace("-","")
}